Subject tracking systems for a movable imaging system

ABSTRACT

A method is provided for controlling a movable imaging assembly (MIA) having a movable platform and an imaging device coupled to and movable relative to the movable platform. The method includes receiving user inputs that define an MIA position relative to a target and a frame position of the target within image frames captured by the imaging device. The user inputs include a horizontal distance, a circumferential position, and a vertical position that define the MIA position, and include a horizontal frame position and a vertical frame position that define the frame position. The method further includes predicting a future position of the target for a future time, and moving the MIA to be in the MIA position at the future time and moving the imaging device for the target to be in the frame position for an image frame captured at the future time.

CROSS-REFERENCE TO RELATED APPLICATION(S)

This application is a continuation-in-part of U.S. application Ser. No. 15/656,559, filed Jul. 21, 2017, which claims priority to and the benefit of U.S. Provisional Application No. 62/364,960, filed Jul. 21, 2016, and U.S. Provisional Application No. 62/372,549, filed Aug. 9, 2016, the entire disclosures of which are incorporated by reference herein.

COPYRIGHT

A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.

TECHNICAL FIELD

The present disclosure relates to subject tracking systems for a movable imaging platform, including enhancements to location prediction, trajectory generation, voice command recognition, compositional technique, and system architecture and data-flow for tracking and synchronization.

BACKGROUND

It is desirable in many circumstances to be able to track a particular subject when recording video. Providing tracking commands to a movable imaging platform using manually operated controls may be too difficult and complex in certain situations, such as a situation where the operator of the movable imaging platform is also a subject to be tracked.

A tracking system works best when locations of the movable imaging platform and subject can be accurately known. Global Positioning System receivers can be utilized to provide a reasonable degree of accuracy, but they are not ideal in all circumstances.

It is also desirable in many circumstances to be able to track a particular subject when recording video. Once a subject has been identified in a video stream by a subject tracking system, the tracking system may automatically or semi-automatically frame the subject within the video. Furthermore, it may be desirable to limit the region in which an aerial-based subject tracking system operates in order to ensure the safety of the user and at the same time ensure that the tracking system continues to function robustly.

SUMMARY

A movable imaging system may include a movable imaging assembly (MIA), such as an unmanned aerial vehicle (UAV), that has a movable imaging device, such as a camera, attached to it. The movable imaging system may also include a controller or external device that is communicatively connected to the MIA using, e.g., a wireless link.

According to an implementation, a method is provided for tracking a subject with an imaging system forming a part of a movable imaging assembly. The method includes capturing an image frame using an imaging sensor of the imaging system and locating the subject within a region of interest in the image frame. The region of interest is determined utilizing a motion model and data from a sensor associated with the subject or the movable imaging assembly. The method can also include transferring the image frame to an external device that is connected to the MIA, displaying the transferred image frame on an external display of the external device, and displaying a bounding box around the subject in a position based on a position of the region of interest.

According to another implementation, a method is provided for tracking a subject with an imaging system forming a part of a movable imaging assembly. The method includes capturing a first image frame using an imaging sensor of the imaging system and locating the subject within the first image frame at a first set of frame coordinates. The method then includes capturing a second image frame using the imaging sensor and locating the subject within the second image frame at a second set of frame coordinates. The method further includes capturing a third image frame using the imaging sensor, determining a third set of frame coordinates at which the subject is predicted to be using a motion model and based on the first frame coordinates and the second frame coordinates, and defining a region of interest having a predefined boundary based on the third set of frame coordinates. Finally, the method includes locating the subject by searching within the region of interest.

According to another implementation, a method is provided for tracking a subject with an imaging system forming part of an MIA. The method includes specifying a constraint on movement that limits motion of the MIA relative to a frame of reference that is the target or a fixed global positioning satellite system frame and moving the MIA in accordance with the specified constraints while capturing image frames with an image sensor of the imaging system.

According to another implementation, a method is provided for tracking a target with an imaging system forming part of an MIA. The method includes defining a movable first volume positioned relative to the target having a first boundary within which the MIA may allowably move during flight. The method then includes defining a movable second volume positioned relative to the target and contained within the first volume having a second boundary within which the MIA may not allowably move during flight. The method further includes receiving, by the MIA, a movement command to a trajectory point within the second volume and moving the MIA to a modified trajectory point within the first volume that is not within the second volume and that is proximate to the trajectory point. Finally, the method includes capturing an image with an image sensor of the imaging system while the MIA is at the modified trajectory point.
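As an illustrative, non-authoritative sketch of how a modified trajectory point might be computed, the following assumes both volumes are spheres centered on the target; the function and parameter names are hypothetical and not drawn from the disclosure:

```python
import math

def modify_trajectory_point(point, target, r_inner, r_outer):
    """Clamp a commanded trajectory point to the allowed shell between a
    restricted inner sphere and a bounding outer sphere, both centered on
    the target (a simplifying assumption for this sketch)."""
    dx, dy, dz = (p - t for p, t in zip(point, target))
    dist = math.sqrt(dx * dx + dy * dy + dz * dz) or 1e-9  # guard divide-by-zero
    # Project radially onto the nearest allowed radius, preserving direction,
    # so the modified point remains proximate to the commanded point.
    radius = min(max(dist, r_inner), r_outer)
    scale = radius / dist
    return tuple(t + d * scale for t, d in zip(target, (dx, dy, dz)))

# A command 1 m from the target is pushed out to the 2 m restricted boundary.
print(modify_trajectory_point((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), 2.0, 10.0))
```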

According to another implementation, a method is provided for tracking a target with an imaging system forming part of an MIA. The method includes selecting a compositional technique defining a composition to apply for image frames captured with an image sensor of the imaging system, detecting a movement of the target, calculating an MIA trajectory point to achieve the composition for image frames predicted to be captured with the image sensor based on the movement of the target, moving the MIA to the calculated trajectory point, and capturing one or more image frames with the imaging system at the calculated trajectory point.

According to another implementation, a method is provided for tracking a target with an imaging system forming part of an MIA that includes specifying a constraint on movement that limits motion of the MIA relative to a frame of reference (FOR) that is the target or a fixed global positioning satellite system frame. The method also includes moving the MIA in accordance with the specified constraints while capturing image frames with an image sensor of the imaging system. In the method, the specifying of the constraint on movement includes receiving a voice command signal that is an audio signal or a digital reproduction of the audio signal, performing a speech-to-text conversion on the received voice command signal to produce converted text, searching a command database containing valid commands using the converted text to find a matching valid command that matches the converted text, and determining the constraint on movement based on the matching valid command.
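A minimal sketch of the command-matching step, assuming a small illustrative command database and using fuzzy string matching from the Python standard library (the commands and constraint values shown are hypothetical, not the disclosure's actual command set):

```python
import difflib

# Hypothetical command database; an actual system defines its own valid commands.
COMMAND_DATABASE = {
    "follow me": {"frame_of_reference": "target", "constraint": "maintain distance"},
    "orbit left": {"frame_of_reference": "target", "constraint": "circumferential motion"},
    "hold position": {"frame_of_reference": "gps", "constraint": "fixed position"},
}

def constraint_from_voice(converted_text):
    """Search the command database for a valid command matching the
    speech-to-text output and return its movement constraint, or None."""
    matches = difflib.get_close_matches(
        converted_text.lower(), list(COMMAND_DATABASE), n=1, cutoff=0.6)
    return COMMAND_DATABASE[matches[0]] if matches else None

print(constraint_from_voice("follow me"))  # tolerant of minor transcription noise
```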

According to another implementation, a method is provided for determining a distance between an MIA and a moving target being tracked by an imaging device of the MIA, including analyzing signals of ultra-wide-band transceivers (UWBTs) distributed between the MIA and the moving target, each of the UWBTs being affixed to one of the MIA and the moving target, determining a distance between the MIA and the moving target based on the analyzed signals, and providing the determined distance to a tracking system that is utilized by the MIA to track the moving target.
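By way of a hedged illustration only (the disclosure does not specify a ranging scheme here), UWB distance is commonly derived from signal time of flight; a single-sided two-way ranging computation might look like the following, with all names hypothetical:

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def twr_distance(t_poll_tx, t_resp_rx, t_remote_rx, t_remote_tx):
    """Single-sided two-way ranging between two UWB transceivers.
    t_poll_tx / t_resp_rx are local timestamps of the poll and its response;
    t_remote_rx / t_remote_tx are the remote transceiver's receive and
    transmit timestamps. All times are in seconds."""
    round_trip = t_resp_rx - t_poll_tx
    reply_delay = t_remote_tx - t_remote_rx
    time_of_flight = (round_trip - reply_delay) / 2.0
    return SPEED_OF_LIGHT * time_of_flight

# Example: a 100 ns round trip with 60 ns of remote turnaround -> roughly 6 m.
print(twr_distance(0.0, 100e-9, 20e-9, 80e-9))
```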

According to another implementation, a method is provided for tracking a subject with an imaging system forming part of an MIA. The method includes capturing a first image frame using an imaging sensor of the imaging system, transferring the first image frame to an external device that is connected to the MIA, locating the subject within the transferred first image frame at a first set of frame coordinates, displaying the transferred first image frame on an external display of the external device, and displaying a bounding box around the subject in the transferred first image frame on the external display. The method further includes capturing a second image frame using the imaging sensor, transferring the second image frame to the external device, locating the subject within the transferred second image frame at a second set of frame coordinates, displaying the transferred second image frame on the external display, and displaying a bounding box around the subject in the transferred second image frame on the external display. The method further includes capturing a third image frame using the imaging sensor, transferring the third image frame to the external device, and determining a third set of frame coordinates at which the subject is predicted to be using a motion model and based on the first frame coordinates and the second frame coordinates. Finally, the method further includes displaying a bounding box at a position related to the third set of frame coordinates on the external display.

A method for tracking a subject in successive image frames includes obtaining previous image frames with an imaging device, processing the previous image frames, obtaining motion information of the imaging device and a subject, determining a region of interest, obtaining a subsequent image frame, and processing the region of interest. The processing includes determining previous frame positions of the subject therein. The motion information is obtained with sensors physically associated with one or more of the imaging device and the subject. The region of interest is located in a predetermined spatial relationship relative to a predicted frame position of the subject.

A method for tracking a subject in successive image frames includes determining a predicted frame location of a subject, determining a region of interest, obtaining a subsequent image frame, and processing the region of interest to locate the subject. The predicted frame location is a location at which the subject is estimated to appear in a subsequent image frame to be obtained at a subsequent time. The determining of the region of interest includes determining the location of the region of interest to be in a predetermined spatial relationship relative to the predicted frame location. The obtaining of the subsequent image frame is performed at a subsequent time with an imaging device.

A movable imaging system includes a movable platform, an imaging device, and a tracking system. The movable platform is movable in real space. The imaging device is for capturing successive image frames that form a video, and is connected to the movable platform. The tracking system is for tracking a subject in the successive image frames. The tracking system locates a region of interest for a subsequent image frame at a predicted frame location of the subject in a future image frame. The predicted frame location is based on previous frame positions of the subject in the successive image frames, motion information of the imaging device, and motion information of the subject. The tracking system processes the region of interest of the future image frame to locate the subject in the future image frame.

In an implementation, a method is provided for controlling a movable imaging assembly (MIA) having a movable platform and an imaging device coupled to and movable relative to the movable platform. The method includes receiving user inputs that define an MIA position relative to a target and a frame position of the target within image frames captured by the imaging device. The user inputs include a horizontal distance, a circumferential position, and a vertical position that define the MIA position, and include a horizontal frame position and a vertical frame position that define the frame position. The method further includes predicting a future position of the target for a future time, and moving the MIA to be in the MIA position at the future time and moving the imaging device for the target to be in the frame position for an image frame captured at the future time.

In an implementation, a method is provided for controlling a movable imaging assembly having a movable platform and an imaging device coupled to and movable relative to the movable platform. The method includes receiving user inputs that define an MIA position relative to a target and a frame position of the target within image frames captured by the imaging device. The method further includes predicting a future position of the target for a future time, and moving the MIA to be in the MIA position at the future time and moving the imaging device for the target to be in the frame position for an image frame captured at the future time.

In an implementation, a method is provided for controlling a movable imaging assembly having a movable platform and an imaging device coupled to and movable relative to the movable platform. The method includes predicting a future zone position at a future time of one or more restricted zones defined relative to a target and in which the MIA is restricted from intruding. The method also includes predicting whether intended flight instructions will result in the MIA intruding the one or more restricted zones at the future time. The method also includes controlling the MIA according to the intended flight instructions if the MIA is predicted to not intrude the one or more restricted zones with the intended flight instructions, or controlling the MIA according to modified flight instructions if the MIA is predicted to intrude the one or more restricted zones with the intended flight instructions.

These and other objects, features, and characteristics of the system and/or method disclosed herein, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification, wherein like reference numerals designate corresponding parts in the various figures. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the disclosure. As used in the specification and in the claims, the singular form of “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a movable imaging system and high-level components according to various implementations of this disclosure.

FIG. 2A is a pictorial illustration of the MIA according to an implementation.

FIG. 2B is a pictorial illustration of the imaging device according to an implementation.

FIG. 2C is a pictorial illustration of an MIA controller and user interface according to an implementation.

FIG. 2D is a pictorial illustration of the imaging device of FIG. 2B within a movement mechanism.

FIG. 3 is a block diagram illustrating components of an imaging device according to an implementation.

FIG. 4A is a block diagram of a tracking system.

FIG. 4B is a flowchart of a technique for tracking a subject in video image frames, which may be implemented by the tracking system of FIG. 4A.

FIG. 5A is a flowchart of a technique for determining a region of interest, which may be used in the technique of FIG. 4B.

FIGS. 5B-5C are pictorial representations of video image frames that illustrate subject tracking with the technique of FIG. 5A.

FIG. 6A is a flowchart of another technique for determining a region of interest, which may be used in the technique of FIG. 4B.

FIGS. 6B-6E are pictorial representations of video image frames that illustrate subject tracking with the technique of FIG. 6A.

FIGS. 7A and 7B are pictorial illustrations of an imaging device positioned with respect to a target.

FIG. 7C is a block diagram of an implementation of a tracking system.

FIG. 7D is a flow diagram of a method implemented by the tracking system of FIG. 7C.

FIG. 7E is a pictorial perspective view of the MIA of FIG. 2A operating within predefined volumes.

FIG. 7F is a block diagram of an implementation of another tracking system.

FIG. 7G is a flow diagram of a method implemented by the tracking system of FIG. 7F.

FIG. 8 is a pictorial representation of a video image frame that illustrates an application of the rule of thirds.

FIG. 9A is a block diagram of an implementation of a voice recognition system that may interact with a tracking system.

FIG. 9B is a block diagram of an implementation of a voice-controlled tracking system.

FIG. 9C is a flow diagram of a method implemented by the tracking system of FIG. 9B.

FIG. 10 is a pictorial diagram of a target T comprising a plurality of selectable subjects.

FIG. 11A is a pictorial representation of an MIA, such as the MIA of FIG. 2A, tracking a target using ultra-wide-band transceivers.

FIG. 11B is a block diagram of an implementation of another tracking system.

FIG. 11C is a flow diagram of a method implemented by the tracking system of FIG. 11B.

FIG. 12A is a block diagram of various modules of a tracking imaging system having an un-optimized display system, according to an implementation.

FIG. 12B is a block diagram of an alternative display system for use in the tracking imaging system of FIG. 12A.

FIG. 12C is a block diagram of another alternative display system for use in the tracking imaging system of FIG. 12A.

FIG. 12D is a sequence of display images on a display device provided by the tracking imaging system of FIG. 12A.

FIG. 12E is a flow diagram of a method implemented by the tracking imaging systems of FIGS. 12A-12C.

FIGS. 13-21 are block diagrams illustrating various architecture configurations for implementing certain functions of the movable imaging system.

All original Figures disclosed herein are © Copyright 2018 GoPro Inc. All rights reserved.

DETAILED DESCRIPTION

Implementations of the present technology will now be described in detail with reference to the drawings, which are provided as illustrative examples to enable those skilled in the art to practice the technology. The figures and examples below are not meant to limit the scope of the present disclosure to a single implementation or embodiment, but other implementations and embodiments are possible by way of interchange of or combination with some or all of the described or illustrated elements. Wherever convenient, the same reference numbers will be used throughout the drawings to refer to same or like parts.

FIG. 1 is a block diagram of a movable imaging system 10 and high-level components according to various implementations of this disclosure. The movable imaging system 10 may have two primary components: a movable imaging assembly or MIA 20 and an external device 50, such as an MIA controller with a user interface. These components may be communicatively connected via a link 55. The link 55 may be wireless or wired. Other components may also be included within the movable imaging system 10. For example, the MIA 20 may comprise an imaging device 100, such as a camera (as used herein, the term “camera” is defined broadly to include any form of imaging device) that can be used to capture still and video images. The MIA 20 may include a movable platform 40 that can be moved positionally and/or rotationally with respect to a fixed reference ground. The MIA 20 may also include an imaging device movement mechanism 30 that allows the imaging device 100 to move positionally and/or rotationally with respect to the movable platform 40.

In some implementations, the external device 50 may correspond to a smartphone, a tablet computer, a phablet, a smart watch, a portable computer, and/or another device configured to receive user input and communicate information with the imaging device 100, imaging device movement mechanism 30, and/or movable platform 40 individually, or with the MIA 20 as a whole.

In one or more implementations, the link 55 may utilize any wireless interface configuration, e.g., WiFi, Bluetooth (BT), cellular data link, ZigBee, near field communications (NFC) link, e.g., using ISO/IEC 14443 protocol, ANT+ link, and/or other wireless communications link. In some implementations, the link 55 may be effectuated using a wired interface, e.g., HDMI, USB, digital video interface, display port interface (e.g., digital display interface developed by the Video Electronics Standards Association (VESA), Ethernet, Thunderbolt), and/or other interface.

The UI of the external device 50 may operate a software application (e.g., GoPro Studio®, GoPro App®, and/or other application) configured to perform a variety of operations related to camera configuration, control of video acquisition, and/or display of video captured by the imaging device 100. An application (e.g., GoPro App®) may enable a user to create short video clips and share video clips to a cloud service (e.g., Instagram®, Facebook®, YouTube®, Dropbox®); perform full remote control of imaging device 100 functions; live preview video being captured for shot framing; mark key moments while recording (e.g., HiLight Tag®, View HiLight Tags in GoPro Camera Roll®) for location and/or playback of video highlights; wirelessly control camera software; and/or perform other functions. Various methodologies may be utilized for configuring the imaging device 100 and/or displaying the captured information.

By way of an illustration, the UI of the external device 50 may receive a user setting characterizing image resolution (e.g., 3840 pixels by 2160 pixels), frame rate (e.g., 60 frames per second (fps)), and/or other settings (e.g., location) related to an activity (e.g., mountain biking) being captured by the user. The UI of the external device 50 may communicate these settings to the imaging device 100 via the link 55.

A user may utilize the UI of the external device 50 to view content acquired by the imaging device 100. A display of the UI of the external device 50 may act as a viewport into a 3D space of the content. In some implementations, the UI of the external device 50 may communicate additional information (e.g., metadata) to the imaging device 100. By way of an illustration, the UI of the external device 50 may provide orientation of the UI of the external device 50 with respect to a given coordinate system to the imaging device 100 to enable determination of a viewport location or dimensions for viewing of a portion of the panoramic content, or both. By way of an illustration, a user may rotate (sweep) the UI of the external device 50 through an arc in space. The UI of the external device 50 may communicate display orientation information to the imaging device 100 using a communication interface such as link 55. The imaging device 100 may provide an encoded bitstream configured to enable viewing of a portion of the content corresponding to a portion of the environment of the display location as the imaging device 100 traverses the path. Accordingly, display orientation information sent from the UI of the external device 50 to the imaging device 100 allows user-selectable viewing of captured image and/or video.

In many instances, it is desirable to track a target (which may include one or more subjects) with the MIA 20. Various forms of tracking may be utilized, including those discussed below and in U.S. Provisional Patent Application Ser. No. 62/364,960, filed Jul. 21, 2016, and herein incorporated by reference in its entirety. A tracking system 60 may be utilized to implement the described forms of tracking. The tracking system 60 may comprise a processor and algorithms that are used for tracking the target. The tracking system 60 is shown in dashed lines since it may be included entirely within the MIA 20 or entirely within the external device 50, or portions of the tracking system 60 may be located or duplicated within each of the MIA 20 and the external device 50. The tracking system 60 may control the MIA 20, the imaging device movement mechanism 30, and/or the imaging device 100 to locate a subject S within successive image frames and/or to physically move the MIA 20 and/or the imaging device 100 to maintain the subject S within a field of view of the imaging device 100, even as the subject S moves in real space and/or relative to the MIA 20. A voice recognition system 70 may also be utilized to interact with the tracking system 60. The voice recognition system 70 is described in more detail below.

FIGS. 2A-2D are pictorial illustrations of implementations of the components shown in FIG. 1. FIG. 2A is a pictorial illustration of the MIA 20 according to an implementation. In the implementation shown, the MIA 20 includes a movable platform 40 that is a quadcopter drone, but the invention is not limited to this implementation. The MIA 20 could be any form of an aerial vehicle or any form of movable device that is movable with respect to a fixed ground, which could include movable mechanical systems that are tied to the earth. As shown in FIG. 2A, the imaging device 100 is fixedly mounted in the front of the movable platform 40 so that it points in a direction along an axis of the movable platform 40. However, in various implementations, the mounting of the imaging device 100 to the movable platform 40 is done using the imaging device movement mechanism 30.

FIG. 2B is a pictorial illustration of the imaging device 100. In FIG. 2B, the imaging device 100 is a GoPro Hero4® camera; however, any type of imaging device 100 may be utilized. The imaging device 100 may include a video camera device. FIG. 2B also shows a lens 130 of the camera, along with a display 147 (e.g., display screen).

FIG. 2C is a pictorial illustration of an external device 50, specifically, an MIA controller and user interface according to an implementation. The user interface may further comprise a display system 51 with a display device 52. The MIA controller may further comprise a communications interface via which it may receive commands both for operation of the movable platform 40, such as the UAV or drone, and operation of the imaging device 100. The commands can include movement commands, configuration commands, and other types of operational control commands.

FIG. 2D is a pictorial illustration of the imaging device 100 within the imaging device movement mechanism 30. The imaging device movement mechanism 30 couples the imaging device 100 to the movable platform 40. The implementation of the imaging device movement mechanism 30 shown in FIG. 2D is a three-axis gimbal mechanism that permits the imaging device 100 to be rotated about three independent axes. However, the imaging device movement mechanism 30 may include any type of translational and/or rotational elements that permit rotational and/or translational movement in one, two, or three dimensions.

As illustrated in FIG. 3, which is a block diagram illustrating components of an imaging device 100 according to an implementation, the imaging device 100 may include a processor 132 which controls operation of the imaging device 100. In some implementations, the processor 132 may include a system on a chip (SOC), microcontroller, microprocessor, CPU, DSP, ASIC, GPU, and/or other processors that control the operation and functionality of the imaging device 100. The processor 132 may interface with mechanical, electrical, sensory, or power modules and/or a UI module 146 via driver interfaces and/or software abstraction layers. Additional processing and memory capacity may be used to support these processes. These components may be fully controlled by the processor 132. In some implementations, one or more components may be operable by one or more other control processes (e.g., a GPS receiver may include a processing apparatus configured to provide position and/or motion information to the processor 132 in accordance with a given schedule (e.g., values of latitude, longitude, and elevation at 10 Hz)).

The imaging device 100 may also include image optics 134 (e.g., optics module), which may include the lens 130 as an optical element of the imaging device 100. In some implementations, the lens 130 may be a fisheye lens that produces images having a fisheye (or near-fisheye) field of view (FOV). Other types of image optics 134 may also be utilized, such as, by way of non-limiting example, one or more of a standard lens, macro lens, zoom lens, special-purpose lens, telephoto lens, prime lens, achromatic lens, apochromatic lens, process lens, wide-angle lens, ultra-wide-angle lens, fisheye lens, infrared lens, ultraviolet lens, perspective control lens, other lens, and/or other optical element. In some implementations, the optics module 134 may implement focus controller functionality configured to control the operation and configuration of the camera lens. The optics module 134 may receive light from an object and couple received light to an image sensor 136, discussed below.

The imaging device 100 may include one or more image sensors 136 including, by way of non-limiting examples, one or more of a charge-coupled device (CCD) sensor, active pixel sensor (APS), complementary metal-oxide semiconductor (CMOS) sensor, N-type metal-oxide-semiconductor (NMOS) sensor, and/or other image sensor. The image sensor 136 may be configured to capture light waves gathered by the optics module 134 and to produce image data based on control signals from a sensor controller 140, discussed below. The image sensor 136 may be configured to generate a first output signal conveying first visual information regarding an object. The visual information may include, by way of non-limiting example, one or more of an image, a video, and/or other visual information. The optics module 134 and the image sensor 136 may be embodied in a housing.

The imaging device 100 may further include an electronic storage 138 (e.g., an electronic storage element) in which configuration parameters, image data, code for functional algorithms, and the like may be stored. In some implementations, the electronic storage 138 may include a system memory module that is configured to store executable computer instructions that, when executed by the processor 132, perform various camera functionalities including those described herein. The electronic storage 138 may include storage memory configured to store content (e.g., metadata, images, audio) captured by the imaging device 100.

The electronic storage 138 may include non-transitory memory configured to store configuration information and/or processing code configured to enable, e.g., video information and metadata capture, and/or to produce a multimedia stream comprised of, e.g., a video track and metadata in accordance with the methodologies of the present disclosure. In one or more implementations, the processing configuration may include capture type (video, still images), image resolution, frame rate, burst setting, white balance, recording configuration (e.g., loop mode), audio track configuration, and/or other parameters that may be associated with audio, video, and/or metadata capture. Additional memory may be available for other hardware/firmware/software needs of the imaging device 100. The memory and processing capacity may aid in management of processing configuration (e.g., loading, replacement), operations during a startup, and/or other operations. Consistent with the present disclosure, the various components of the imaging device 100 may be remotely disposed from one another and/or aggregated. For example, one or more sensor components may be disposed distal from the imaging device 100. Multiple mechanical, sensory, or electrical units may be controlled by a learning apparatus via network/radio connectivity.

The processor 132 may interface to the sensor controller 140 in order to obtain and process sensory information for, e.g., object detection, face tracking, stereo vision, and/or other tasks.

The processor 132 may also interface with one or more metadata sources 144 (e.g., metadata module). The metadata sources 144, in more detail, may include sensors such as an inertial measurement unit (IMU) including one or more accelerometers and/or gyroscopes, a magnetometer, a compass, a global positioning satellite (GPS) sensor, an altimeter, an ambient light sensor, a temperature sensor, a pressure sensor, a heart rate sensor, a depth sensor (such as radar, an infrared-based depth sensor such as a Kinect-style depth sensor, or a stereo depth sensor), and/or other sensors. The imaging device 100 may contain one or more other metadata/telemetry sources, e.g., image sensor parameters, battery monitor, storage parameters, and/or other information related to camera operation and/or capture of content. The metadata sources 144 may obtain information related to the environment of the imaging device 100 and aspects in which the content is captured.

By way of a non-limiting example, the accelerometer may provide device motion information including acceleration vectors representative of motion of the imaging device 100, from which velocity vectors may be derived. The gyroscope may provide orientation information describing the orientation of the imaging device 100, the GPS sensor may provide GPS coordinates, time, and identifying location of the imaging device 100, and the altimeter may obtain the altitude of the imaging device 100. In some implementations, the metadata sources 144 may be rigidly coupled to the imaging device 100 such that any motion, orientation, or change in location of the imaging device 100 also occurs for the metadata sources 144.

The sensor controller 140 and/or the processor 132 may be operable to synchronize various types of information received from the metadata sources 144. For example, timing information may be associated with the sensor data. Using the timing information, metadata information may be related to content (photo/video) captured by the image sensor 136. In some implementations, the metadata capture may be decoupled from the video/image capture. That is, metadata may be stored before, after, and in-between one or more video clips and/or images. In one or more implementations, the sensor controller 140 and/or the processor 132 may perform operations on the received metadata to generate additional metadata information. For example, the processor 132 may integrate the received acceleration information to determine a velocity profile of the imaging device 100 during a recording of a video. In some implementations, video information may consist of multiple frames of pixels using any applicable encoding method (e.g., H.262, H.264, Cineform, and/or other codec). In some implementations, the imaging device 100 may include, without limitation, video, audio, capacitive, radio, vibrational, ultrasonic, infrared, radar, LIDAR and/or sonar, and/or other sensory devices.
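As a brief illustration of deriving a velocity profile by integrating received acceleration information, the following sketch applies the trapezoidal rule to timestamped single-axis samples (a simplification; a full implementation would integrate each axis of the acceleration vector after removing gravity):

```python
def velocity_profile(timestamps, accelerations, v0=0.0):
    """Integrate timestamped acceleration samples (m/s^2) into a velocity
    profile (m/s) using the trapezoidal rule, starting from velocity v0."""
    velocities = [v0]
    for i in range(1, len(timestamps)):
        dt = timestamps[i] - timestamps[i - 1]
        avg_accel = (accelerations[i] + accelerations[i - 1]) / 2.0
        velocities.append(velocities[-1] + avg_accel * dt)
    return velocities

# Example: a constant 1 m/s^2 sampled at 10 Hz for half a second.
print(velocity_profile([0.0, 0.1, 0.2, 0.3, 0.4, 0.5], [1.0] * 6))
```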

The imaging device 100 may include audio devices 145, such as one or more microphones configured to provide audio information that may be associated with images acquired by the image sensor 136. Two or more microphones may be combined to form a microphone system that is directional. Such a directional microphone system can be used to determine the direction or location of a sound source and/or to eliminate undesirable noise originating in a particular direction. Various audio filters may be applied as well. The sensor controller 140 may receive image and/or video input from the image sensor 136 and audio information from the audio devices 145. In some implementations, audio information may be encoded using, e.g., AAC, AC3, MP3, linear PCM, MPEG-H, and/or other audio coding formats (audio codec). In one or more implementations of spherical video and/or audio, the audio codec may include a 3-dimensional audio codec. For example, an Ambisonics codec can produce full surround audio including a height dimension. Using a G-format Ambisonics codec, a special decoder may not be required.

In some implementations, one or more external metadata devices (not shown) may interface to the imaging device 100 via a wired link (not shown), e.g., HDMI, USB, coaxial audio, and/or other interface. The metadata obtained by the imaging device 100 may be incorporated into the combined multimedia stream using any applicable known methodologies.

The imaging device 100 may include its own display (e.g., display 147 shown in FIG. 2B) as a part of its UI 146 (e.g., UI module). The display may be configured to provide information related to camera operation mode (e.g., image resolution, frame rate, capture mode, sensor mode, video mode, photo mode), connection status (connected, wireless, wired connection), power mode (e.g., standby, sensor mode, video mode), information related to metadata sources (e.g., heart rate, GPS), and/or other information. The UI 146 may include other components (e.g., one or more buttons) configured to enable the user to start, stop, pause, and/or resume sensor and/or content capture. User commands may be encoded using a variety of approaches, including but not limited to duration of button press (pulse width modulation), number of button presses (pulse code modulation), or a combination thereof. By way of an illustration, two short button presses may initiate sensor acquisition mode, and a single short button press may be used to communicate (i) initiation of video or photo capture and cessation of video or photo capture (toggle mode) or (ii) video or photo capture for a given time duration or number of frames (burst capture). Other user command or communication implementations may also be realized, e.g., one or more short or long button presses.
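As a hedged sketch of how such press patterns might be decoded (the mapping below mirrors the examples above but is otherwise hypothetical):

```python
def decode_presses(press_durations, long_press_threshold=1.0):
    """Decode a burst of button presses into a user command based on the
    number of presses and their durations (seconds)."""
    count = len(press_durations)
    has_long = any(d >= long_press_threshold for d in press_durations)
    if count == 2 and not has_long:
        return "initiate sensor acquisition mode"
    if count == 1 and not has_long:
        return "toggle video/photo capture"
    if count == 1 and has_long:
        return "burst capture"
    return "unknown command"

print(decode_presses([0.2, 0.3]))  # two short presses
```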

In some implementations, the UI 146 may include virtually any type of device capable of registering inputs from and/or communicating outputs to a user. These may include, without limitation, display, touch, proximity sensitive interface, light, sound receiving/emitting devices, wired/wireless input devices, and/or other devices. The UI module 146 may include a display, one or more tactile elements (e.g., buttons and/or virtual touch screen buttons), lights (LED), speaker, and/or other UI elements. The UI module 146 may be operable to receive user input and/or provide information to a user related to operation of the imaging device 100. The imaging device 100 may further include, in some implementations, an input/output or I/O module 148. The I/O module 148 may be configured to synchronize the imaging device 100 with other cameras and/or with other external devices, such as a remote control, a second capture device, a smartphone, the UI of the external device 50 of FIG. 1, and/or a video server. The I/O module 148 may be configured to communicate information to/from various I/O components. In some implementations, the I/O module 148 may include a wired and/or wireless communications interface (e.g., Wi-Fi, Bluetooth, USB, HDMI, Wireless USB, Near Field Communication (NFC), Ethernet, a radio frequency transceiver, and/or other interfaces) configured to communicate to one or more external devices (e.g., the UI of the external device 50 in FIG. 1 and/or another metadata source). In some implementations, the I/O module 148 may interface with LED lights, a display, a button, a microphone, speakers, and/or other I/O components. In one or more implementations, the I/O module 148 may interface to an energy source, e.g., a battery, and/or a DC electrical source.

In some implementations, the I/O module 148 of the imaging device 100 may include one or more connections to external computerized devices to allow for, among other things, configuration and/or management of remote devices, e.g., as described above with respect to FIG. 1 and/or as described below with respect to FIG. 3. The I/O module 148 may include any of the wireless or wireline interfaces discussed above and, further, may include customized or proprietary connections for specific applications.

In some implementations, a communication device 150 may be coupled to the I/O module 148 and may include a component (e.g., a dongle) having an infrared sensor, a radio frequency transceiver and antenna, an ultrasonic transducer, and/or other communications interfaces used to send and receive wireless communication signals. In some implementations, the communication device 150 may include a local (e.g., Bluetooth, Wi-Fi) and/or broad range (e.g., cellular LTE) communications interface configured to enable communications between the imaging device 100 and a remote device (e.g., the UI of the external device 50 in FIG. 1). The communication device 150 may employ communication technologies including one or more of Ethernet, 802.11, worldwide interoperability for microwave access (WiMAX), 3G, Long Term Evolution (LTE), digital subscriber line (DSL), asynchronous transfer mode (ATM), InfiniBand, PCI Express Advanced Switching, and/or other communication technologies. By way of non-limiting example, the communication device 150 may employ networking protocols including one or more of multiprotocol label switching (MPLS), transmission control protocol/Internet protocol (TCP/IP), User Datagram Protocol (UDP), hypertext transport protocol (HTTP), simple mail transfer protocol (SMTP), file transfer protocol (FTP), and/or other networking protocols.

Information exchanged over the communication device 150 may be represented using formats including one or more of hypertext markup language (HTML), extensible markup language (XML), and/or other formats. One or more exchanges of information between the imaging device 100 and outside devices may be encrypted using encryption technologies including one or more of secure sockets layer (SSL), transport layer security (TLS), virtual private networks (VPNs), Internet Protocol security (IPsec), and/or other encryption technologies.

The imaging device 100 may include a power system 152 tailored to the needs of the applications of the imaging device 100. For example, for a small-sized, lower-power action camera, a wireless power solution (e.g., battery, solar cell, inductive (contactless) power source, rectification, and/or other power supply) may be used.

Location Prediction for Subject Tracking

Referring to FIGS. 4A-4B, a tracking system 300 and a method or technique 400 are provided for tracking a subject S in successive image frames obtained by the imaging device 100 (e.g., video). The tracking system 300 may be implemented wholly or partially by the tracking system 60. It may be desirable in many circumstances to track a particular subject when recording a video, such as by locating the subject in successive image frames of the video (e.g., identifying and determining frame positions of the subject), for example, to control the imaging device 100 and/or MIA 20 to ensure that the subject S remains in the image frames. Subject tracking may be difficult, for example, with simultaneous movement of the subject and the imaging device 100 and/or by taking significant time and/or consuming significant computing resources when large amounts of video data are captured (e.g., high resolution image frames, such as 4K).

Rather than process (e.g., search) an entire image frame to locate (e.g., identify and/or determine a position of) the subject S therein, the technique 400 determines a region of interest (ROI) of the image frame to be processed. The ROI is a portion (e.g., window) of the image frame, which is smaller than the entire image frame and thereby requires less time and/or less computing resources to be processed than the entire image frame.

As shown in FIG. 4A, the tracking system 300 includes various modules performed by various hardware components to implement the technique 400, and may also include or be in communication with various sensors associated with the imaging device 100 and/or the subject S. The tracking system 300 and its various modules are introduced below at a high level, with further description of the techniques implemented thereby discussed in still further detail below.

The modules may be included in and/or operated by various components of the movable imaging system 10 (e.g., the MIA 20, the imaging device 100, the external device 50, the tracking system 60, etc.). For example, the tracking system 300 includes a module 310 (e.g., an ROI module) for determining the ROI for a particular image frame, a module 320 (e.g., an image capture module) for obtaining the image frame, and a module 330 (e.g., an image processing module) for processing the image frame, such as the ROI of the image frame. The tracking system 300 may also include a module 350 (e.g., a tracking control module) for controlling the imaging device 100 and/or the MIA 20.

The ROI module 310 includes a module 312 (e.g., a visual motion estimation module) for determining a visual motion estimate, a module 313 (e.g., an imaging device motion estimation module) for determining an imaging device motion estimate, and/or a module 314 (e.g., a subject motion estimation module) for determining a subject motion estimate, along with a module 315 (e.g., a combined motion estimation module) for determining a combined motion estimate, and a module 316 (e.g., an ROI determination module) for determining the ROI. The ROI module 310 may further include a module 317 (e.g., a relative motion estimation module) for determining relative motion between the subject S and the imaging device 100. Various of the modules may be omitted in accordance with the technique 400 and variations thereof described below.

The visual motion estimation module 312 may receive visual information from the image processing module 330, such as previous positions of the subject S in previously captured image frames, from which the visual motion estimate is determined.

The imaging device motion estimation module 313 may receive motion information of the imaging device 100, or other components of the MIA 20, such as the movable platform 40 and/or the imaging device movement mechanism 30, with motion sensors 313a physically associated therewith. The motion sensors 313a associated with the imaging device 100 may include the metadata sources 144. The imaging device motion estimate is determined from information received from the motion sensors 313a, as discussed in further detail below.

The subject motion estimation module 314 may receive motion information of the subject S with motion sensors 314a physically associated therewith. For example, the motion sensors 314a may be sensors of the external device 50 being held by or attached to the subject S. The subject motion estimate is determined from the information received from the sensors 314a.

The relative motion estimation module 317 may, if included, receive visual information and/or motion information from the estimation modules 312, 313, 314 and/or the sensors 313a, 314a.

The combined motion estimation module 315 receives the estimates from the estimation modules 312, 313, 314, 317, from which the combined motion estimate is determined.

The ROI determination module 316 receives the combined motion estimate, from which the size and/or position of the ROI is determined.

As shown in the flowchart of FIG. 4B, the technique 400, which may be implemented by the subject tracking system 300, generally includes operations of determining 410 the ROI for an image frame IF_(t) corresponding to a time t, obtaining 420 the image frame IF_(t) at the time t, and processing 430 the ROI of the image frame to locate a subject S within the image frame IF_(t), which may also include determining a size of the subject S in the image frame IF_(t). The technique 400 may further include repeating 440 the determining 410, the obtaining 420, and the processing 430 for still further image frames IF_(t+1), IF_(t+2), . . . IF_(t+n) to be obtained at subsequent times t+1, t+2, . . . t+n. The technique 400 may also include controlling 450 the imaging device 100 and/or the MIA 20 to track the subject S, for example, to maintain the subject S in subsequent image frames. For example, the controlling 450 may include controlling the location and/or orientation of the movable platform 40 (e.g., using output devices, such as a rotor), the location and/or orientation of the imaging device 100 with respect to the movable platform 40 (e.g., by operating the imaging device movement mechanism 30), and/or by controlling the imaging device 100 (e.g., with a zoom function).
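A compact sketch of this capture-and-track loop follows; the module objects and their methods are hypothetical stand-ins for the ROI module 310, image capture module 320, image processing module 330, and tracking control module 350 (operations 410-450), not an actual API from the disclosure:

```python
def track_subject(camera, roi_module, image_processor, controller, num_frames):
    """Run the tracking loop: determine an ROI for the next frame, capture
    the frame, search only the ROI for the subject, and steer the platform
    and/or imaging device to keep the subject framed."""
    subject_position = None
    for _ in range(num_frames):
        roi = roi_module.determine_roi(subject_position)        # determining 410
        frame = camera.capture()                                # obtaining 420
        subject_position = image_processor.locate(frame, roi)   # processing 430
        if subject_position is not None:
            controller.update(subject_position)                 # controlling 450
```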

The image frame for which the ROI is determined may be referred to as a subsequent image frame or a future image frame. The determining 410 of the ROI may be performed in various manners described below, for example, by the ROI module 310. The obtaining 420 of the image frame is performed, for example, by the image capture module 320 with the imaging device 100, which may be part of the MIA 20, by capturing the image frame as discussed above. The processing 430 of the ROI_(t) is performed for the image frame IF_(t), for example, by the image processing module 330 with the imaging device 100, the MIA 20, the external device 50, and/or the tracking system 60 according to any suitable technique to determine the frame position S_(POSt) in the image frame IF_(t), such as by determining a centroid of the subject S.

The determining 410 of the ROI may be performed in various manners and may include determining a position of the ROI for the image frame and may further include determining a size of the ROI. For example, and as discussed in further detail below, the ROI may be determined for a future image frame according to previous positions of the subject S within previously obtained image frames, motion of the imaging device 100, motion of the subject S, relative motion between the imaging device 100 and the subject S, or combinations thereof. Furthermore, the position of the ROI may be based on a position in which the subject S is predicted to be in the subsequent image frame. As used herein, the terms “frame position” or “subject frame position” refer to the position of the subject S in an image frame, which may include positions at which the subject S has been determined to be located in obtained image frames and may also include a position at which the subject S is located in an obtained image frame that has yet to be processed for locating the subject S therein.

Referring to FIGS. 5A-5C, the ROI for a future image frame may be located relative to the frame position of the subject S in a previous frame. FIG. 5A is a flowchart of a technique 510 for determining the ROI, while FIGS. 5B-5C illustrate the technique 510 visually. The technique 510 presumes close proximity of the subject S in successive image frames and does not predict or estimate specific future locations at which the subject S might appear in a future image frame. The technique 510 may, for example, be implemented by the ROI module 310, including the visual motion estimation module 312 and the ROI determination module 316.

The technique 510 may be used to perform the operation for the determining 410 of the ROI in the technique 400. The technique 510 includes operations of obtaining 512 a first image frame IF_(t−1) at a time t−1 (see FIG. 5B), processing 514 the first image frame IF_(t−1) (or an ROI thereof) to determine a frame position S_(POSt−1) of the subject S in the first frame IF_(t−1) (see FIG. 5B), and locating 516 the ROI_(t) for a second image frame IF_(t) in a predetermined spatial relationship relative to the first frame position S_(POSt−1) (see FIG. 5C). The technique 510 may be repeated as part of the technique 400 for subsequent image frames IF_(t+1), IF_(t+2), . . . IF_(t+n). The first image frame IF_(t−1) may also be referred to as a prior or previous image frame, while the second image frame IF_(t) may be referred to as a subsequent or future image frame or a successive image frame (e.g., being obtained immediately subsequent to the first image frame IF_(t−1), for example, in a video stream obtained by the imaging device 100 at a frame rate, such as 30 fps).

The obtaining 512 of the first image frame IF_(t−1) may be the obtaining 420 performed in the technique 400 for an image frame prior to the image frame IF_(t). The processing 514 may be for an entirety of the image frame IF_(t−1) or may be for an ROI thereof (e.g., as determined in a prior operation of the technique 510). The locating 516 of the ROI_(t) may include centering the ROI_(t) on the frame position S_(POSt−1) of the subject S in the first frame IF_(t−1). The ROI_(t) may, for example, be rectangular as shown (e.g., having a common aspect ratio with the entire image frame), square, or another suitable shape.

The technique 510 may also include determining a size of the ROI_(t). For example, the size of the ROI_(t) may be determined according to a size of the subject S in the image frame IF_(t−1), for example, increasing or decreasing in size if the subject S appears larger or smaller in the image frame IF_(t−1) as compared to a previous image frame. As another example, the size of the ROI_(t) may be determined according to a predicted size of the subject S in the image frame IF_(t). Alternatively, the size of the ROI may be a default size or may be fixed as the technique 510 is performed for successive image frames. Generally speaking, a larger ROI_(t) results in a higher likelihood of the subject S being within the ROI_(t) of the image frame IF_(t), while a smaller ROI_(t) results in a lesser likelihood.
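A minimal sketch of the locating 516 operation under these assumptions (a fixed ROI size, centered on the prior frame position and clamped to the image bounds; the names are illustrative):

```python
def locate_roi(prev_position, roi_size, frame_size):
    """Center the next frame's ROI on the subject's previous frame position,
    clamped so the ROI stays within the image. Arguments are (x, y) pixel
    tuples; returns the ROI as (left, top, right, bottom)."""
    px, py = prev_position
    w, h = roi_size
    fw, fh = frame_size
    left = min(max(px - w // 2, 0), fw - w)
    top = min(max(py - h // 2, 0), fh - h)
    return (left, top, left + w, top + h)

# Example: subject last seen near the right edge of a 1920x1080 frame.
print(locate_roi((1900, 540), (480, 270), (1920, 1080)))
```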

Referring to FIGS. 6A-6E, a technique 610 and variations thereof are provided for determining the ROI (i.e., the size and the location) relative to a predicted frame position of the subject S in the future image frame. Such techniques may be performed with various different information and/or in various different manners. Such information may include visual information obtained from previously obtained image frames, motion information of the imaging device 100, and/or motion of the subject S, which may be obtained from the previously obtained images and/or various sensors associated therewith. The term “predicted frame position” or “predicted subject frame position” refers to the position at which the subject S is estimated (e.g., predicted, estimated, likely, etc.) to appear in the subsequent image frame. In some implementations, the technique 400 may include initially performing the technique 510 to determine the ROI for one or more initial image frames (e.g., a second image frame in a video image stream), and include later performing another technique (e.g., the technique 610) to determine the ROI for later image frames (e.g., after sufficient visual and/or motion data is acquired to perform the technique 610). The technique 610 may be implemented by the ROI module 310, including the visual, imaging device, subject, relative, and/or combined motion estimation modules 312-315, 317 and the ROI determination module 316.

FIG. 6A is a flowchart of a technique 610 for determining the ROI, while FIGS. 6B-6E illustrate the technique 610 visually. The technique 610 may be used to perform the operation for the determining 410 of the ROI_(t) in the technique 400. The technique 610 includes operations of: determining 620 a motion estimate of the subject S according to previously obtained image frames (e.g., a visual motion estimate), determining 630 a motion estimate of the imaging device 100 in real space (e.g., an imaging device motion estimate), and determining 640 a motion estimate of the subject S in real space (e.g., a subject motion estimate). The technique 610 further includes determining 650 a motion estimate of the subject S according to one or more of the visual motion estimate, the imaging device motion estimate, and the subject motion estimate (e.g., a combined motion estimate), and determining 660 a size and location of the ROI_(t) from the combined motion estimate. The term “real space” refers to a fixed spatial frame of reference, which may be global coordinates or another defined coordinate system. The motion estimates may, for example, be estimates for a change of position of the subject S in the image frames IF, or may be estimates of motion of the imaging device 100 or the subject S from which estimates of the changes of position of the subject S may be derived.

The operation for the determining 620 of the visual motion estimate is, for example, performed by the visual motion estimation module 312 according to a motion model. The visual motion estimate is an estimate of a change of position of the subject S in the image frame (e.g., a change in X, Y coordinates or predicted X, Y coordinates). The motion model uses the frame positions of the subject S in two or more previously obtained image frames IF_(t−m), . . . IF_(t−2), IF_(t−1) to predict motion of the subject S, for example, from the image frame IF_(t−1) to the image frame IF_(t). The determining 620 generally includes operations of obtaining 622 the image frames IF_(t−m), . . . IF_(t−2), IF_(t−1) (see FIGS. 6B-6D), processing 624 the image frames IF_(t−m), . . . IF_(t−2), IF_(t−1) to determine frame positions S_(t−m), . . . S_(t−2), S_(t−1) of the subject S therein (see FIGS. 6B-6D), and determining 626 a visual motion estimate Δ_(x, y) of the subject S using the frame positions S_(t−m), . . . S_(t−2), S_(t−1) and the motion model (see FIG. 6E).

The motion model may, as illustrated in FIG. 6E, be a constant motion model that assumes constant motion of the subject S between the two most recent image frames (e.g., IF_(t−2) and IF_(t−1)) and between the most recent image frame and the subsequent image frame (e.g., between IF_(t−1) and IF_(t)). For example, the constant motion may be a two-dimensional frame position change Δ_(x, y), or may be a three-dimensional frame position change Δ_(x, y, z) that additionally accounts for a distance in a direction perpendicular to the image frame (e.g., based on a change of size of the subject S in the image frames or measured distances between the subject S and the imaging device 100). Alternatively, the motion model may use more than two frame positions from previously obtained image frames (e.g., three, four, or more), which may more accurately determine the visual motion estimate by considering more information, for example, using line fitting (e.g., a linear motion model), curve fitting (e.g., a curvilinear motion model, for example, using polynomials and/or splines), or a recursive filter (e.g., an extended Kalman filter (EKF)).
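
By way of illustration only, a constant motion model and a simple linear (line fitting) motion model of the kind described above might be sketched as follows; the function names and the two-dimensional representation are assumptions of this sketch, not part of the described system:

    # Illustrative constant motion model: assumes the frame-position change
    # between IF_(t-2) and IF_(t-1) repeats between IF_(t-1) and IF_(t).
    def predict_constant_motion(pos_prev2, pos_prev1):
        dx = pos_prev1[0] - pos_prev2[0]
        dy = pos_prev1[1] - pos_prev2[1]
        return (pos_prev1[0] + dx, pos_prev1[1] + dy)

    # Illustrative linear motion model: fits a line (least squares) through
    # m previous frame positions and extrapolates one frame ahead.
    def predict_linear_motion(positions):
        m = len(positions)
        ts = range(m)
        t_mean = sum(ts) / m
        x_mean = sum(p[0] for p in positions) / m
        y_mean = sum(p[1] for p in positions) / m
        denom = sum((t - t_mean) ** 2 for t in ts) or 1.0
        vx = sum((t - t_mean) * (p[0] - x_mean) for t, p in zip(ts, positions)) / denom
        vy = sum((t - t_mean) * (p[1] - y_mean) for t, p in zip(ts, positions)) / denom
        return (x_mean + vx * (m - t_mean), y_mean + vy * (m - t_mean))

    print(predict_constant_motion((100, 80), (110, 84)))  # (120, 88)
    print(predict_linear_motion([(100, 80), (110, 84), (120, 88)]))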

The determining 620 of the visual motion estimate may further include determining a confidence value associated therewith, which may be referred to as a visual motion estimate confidence value. The confidence value is a measure of accuracy and/or certainty of the visual motion estimate. The confidence value may be used in the determining 650 of the combined motion estimate, for example, to weight and/or filter the visual motion estimate among the imaging device motion estimate and the subject motion estimate.

Instead or additionally, the visual motion estimate may be, or be based on, relative motion of the imaging device 100 and the subject S as derived from the successive images. This may be referred to as a relative motion estimate, which may be determined by the relative motion estimation module 317. For example, direction and distance measurements (e.g., a vector) between the imaging device 100 and the subject S may be calculated from the frame positions of the subject S in previous image frames and from a focal distance associated therewith (or other measure of distance between the subject S and the imaging device 100), and changes therein. A motion model (e.g., a line or curve fitting model) may be applied to the previous direction and distance measurements to predict future relative motion of the imaging device 100 and the subject S, from which the visual motion estimate may be derived.

Instead or additionally, the visual motion estimate may be based on motion vectors created during video processing (e.g., encoding and/or compression techniques). When the image frames are encoded using certain video encoding techniques, such as H.264 (MPEG-4 Part 10, Advanced Video Coding), the encoding utilizes motion vectors created by the video encoder between the last and the current video image frames. These motion vectors may be utilized to predict or refine the visual motion estimate.

The operation for the determining 630 of the imaging device motion estimate is, for example, performed by the imaging device motion estimation module 313 according to motion information of the imaging device 100. The imaging device motion estimate is an estimate of motion of the imaging device 100 in real space, for example, from time t−1 to t. Alternatively, the imaging device motion estimate may be an estimate of motion of the subject S between the image frame IF_(t−1) and the image frame IF_(t) due to motion of the imaging device 100 in real space. The determining 630 of the imaging device motion estimate generally includes operations of obtaining 632 motion information of the imaging device 100, and determining 634 the imaging device motion estimate from the motion information.

The motion information of the imaging device 100 may include orientation information and position information. The motion information may also be referred to as egomotion. Orientation information may, for example, include roll, pitch, yaw, and higher order terms thereof, such as rotational velocity and/or rotational acceleration. Position information may, for example, include horizontal coordinates (e.g., global positioning or Euclidean coordinates), elevation, and higher order terms thereof, such as translational velocity and/or acceleration.

Orientation information and position information may be obtained from the various sensors 313 a physically associated with the imaging device 100, such as the metadata sources 144. The various sensors may be coupled to the imaging device 100 itself, or may be coupled to other components of the MIA 20, such as the movable platform 40 and the imaging device movement mechanism 30. In one example, the imaging device 100 includes an embedded gyroscope, which includes one or more gyroscopes to detect rotation of the imaging device 100 in multiple axes relative to real space (e.g., the roll, pitch, and yaw). In another example, the MIA 20, or the movable platform 40 thereof, may include a global positioning system, a gyroscope, accelerometers, a barometer, a compass, an altimeter, a magnetometer, an optical flow sensor, and/or an IMU (which may include one or more of the aforementioned sensors) from which the motion information (e.g., orientation and/or position, or changes therein) of the movable platform 40 may be determined in real space. The imaging device movement mechanism 30 may additionally include position sensors, which measure the motion information (e.g., orientation and/or position, or changes therein) of the imaging device 100 relative to the movable platform 40. Thus, from motion information of the movable platform 40 and of the imaging device movement mechanism 30, motion information of the imaging device 100 may be determined.

Still further, motion information of the imaging device 100 in real space may be obtained from the previously obtained image frames IF_(t−m), . . . IF_(t−2), IF_(t−1). For example, the position and/or orientation of the imaging device 100 (e.g., the MIA 20) may be obtained by observing changes in the frame position and/or size of reference points fixed in real space (e.g., features of the terrain which the subject S may move relative to).

The determining 630 of the imaging device motion estimate may further include determining a confidence value associated therewith, which may be referred to as an imaging device motion estimate confidence value. The confidence value is a measure of accuracy and/or certainty of the imaging device motion estimate, which may, for example, be based on the reliability of the motion information (e.g., time delay and/or frequency relative to the time between successive image frames, accuracy of the sensors, availability and/or operation of the sensors, etc.). The confidence value may be used in the determining 650 of the combined motion estimate, for example, to weight and/or filter the imaging device motion estimate among the visual motion estimate and the subject motion estimate.

The operation for the determining 640 of the subject motion estimate is, for example, performed by the subject motion estimation module 314 according to motion information of the subject S. The subject motion estimate is an estimate of motion of the subject S in real space and/or relative to the imaging device 100, for example, from time t−1 to t. Alternatively, the subject motion estimate may be an estimate of motion of the subject S between the image frame IF_(t−1) and the image frame IF_(t) due to motion of the subject S in real space and/or relative motion of the subject S to the imaging device 100. The determining 640 of the subject motion estimate generally includes operations of obtaining 642 motion information of the subject S, and determining 644 the subject motion estimate from the motion information of the subject S.

The motion information of the subject S may include position information. The position information may, for example, include coordinates (e.g., global positioning or Euclidean coordinates) and/or elevation of the subject S in real space, and higher order terms thereof, such as translational velocity and/or acceleration. The position information may instead or additionally include relative positional information between the subject S and the imaging device 100, such as a distance therebetween and/or directional information (e.g., a vector).

Position information may be obtained from various sensors 314 a and/or transmitters physically associated with the subject S. For example, a beacon device, such as the external device 50, a smartphone, accelerometers, a dedicated beacon device, or the beacon schema described below, may be carried by, coupled to, or otherwise physically associated with the subject S. The sensors and/or transmitters may be used to determine the position, velocity, and/or acceleration of the subject S in real space (e.g., as with a global positioning system and/or accelerometers).

The determining 640 of the subject motion estimate may further include determining a confidence value associated therewith, which may be referred to as a subject motion estimate confidence value. The confidence value is a measure of accuracy and/or certainty of the subject motion estimate, which may, for example, be based on the reliability of the motion information (e.g., time delay and/or frequency relative to the time between successive image frames, accuracy of the sensors, etc.). The confidence value may be used in the determining 650 of the combined motion estimate, for example, to weight and/or filter the subject motion estimate among the visual motion estimate and the imaging device motion estimate.

Instead or additionally, the subject motion estimate may be a measure of relative movement between the subject S and the imaging device 100. This may also be referred to as a relative motion estimate, which may be determined by the relative motion estimation module 317. For example, the imaging device 100, the MIA 20, and/or the subject S may include sensors 313 a, 314 a by which distance and direction may be measured. For example, the imaging device 100 and/or the MIA 20 may include sensors (e.g., ultrasonic transceivers) that send and receive signals by which a distance and changes in distance (e.g., direction) may be measured between the imaging device 100 and the subject S. Similarly, the subject S may include a transmitter (e.g., beacon) that sends signals by which a distance and changes in distance (e.g., direction) may be measured (e.g., based on the time between sending and receiving the signal).

The operation for the determining 650 of the combined motion estimate is, for example, performed by the combined motion estimation module 315 according to the visual motion estimate, the imaging device motion estimate, and/or the subject motion estimate. The combined motion estimate is an estimate of the movement that the subject S will undergo from the image frame IF_(t−1) to the future image frame IF_(t), or may be the predicted frame position S_(PRED) of the subject S in the image frame IF_(t). The visual motion estimate, the imaging device motion estimate, and/or the subject motion estimate are combined (e.g., fused) to determine the combined motion estimate. As referenced above, confidence values associated with each of the visual motion estimate, the imaging device motion estimate, and the subject motion estimate may be used, for example, to weight and/or filter each such estimate in determining the combined motion estimate. For example, the imaging device motion estimate, the subject motion estimate, and/or the relative motion estimate may be used to account for motion of the imaging device 100 and the subject S (e.g., egomotion) not accounted for in the visual motion estimate. For example, the imaging device motion estimate, the subject motion estimate, and/or the relative motion estimate may be determined as expected frame motion (i.e., a change of position of the subject S in the image frame) and be added (e.g., in weighted or unweighted form) to the visual motion estimate. By combining the various motion estimates, the predicted frame location S_(PRED), and thus the ROI_(t), may be more accurate, thereby allowing the ROI_(t) to be sized smaller to provide reduced computing time and/or reduced computing resources for tracking the subject S in successive image frames.
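
By way of illustration only, a confidence-weighted combination (e.g., fusion) of the motion estimates might be sketched as follows; the simple weighted-average scheme is an assumption of this sketch rather than the prescribed fusion method:

    def combine_motion_estimates(estimates):
        # Fuse frame-motion estimates (dx, dy) using their confidence values
        # as weights; `estimates` is a list of ((dx, dy), confidence) pairs.
        total = sum(conf for _, conf in estimates)
        if total == 0:
            return (0.0, 0.0)
        dx = sum(d[0] * conf for d, conf in estimates) / total
        dy = sum(d[1] * conf for d, conf in estimates) / total
        return (dx, dy)

    # Visual estimate weighted against device- and subject-derived estimates,
    # each expressed as expected frame motion.
    combined = combine_motion_estimates([
        ((12.0, 4.0), 0.8),  # visual motion estimate
        ((10.0, 5.0), 0.5),  # imaging device motion estimate
        ((11.0, 4.5), 0.3),  # subject motion estimate
    ])
    print(combined)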

The operation for the determining 660 of the size and the location of the ROI_(t) is, for example, performed by the ROI determination module 316 and includes determining a predicted frame location S_(PRED) of the subject S in the image frame IF_(t) and locating the ROI_(t) relative to the predicted frame location S_(PRED) (e.g., in a predetermined location, such as centered thereon).

The determining 660 also includes determining the size of the ROI_(t), which may include increasing or decreasing a size of the ROI_(t) as compared to a previous ROI_(t−1). The size of the ROI_(t) may be increased, for example, if the combined motion estimate indicates the imaging device 100 will be closer to the subject S, which would be expected to appear larger in the image frame IF_(t) and possibly require processing a larger portion of the image frame IF_(t) to locate the subject S therein. The size of the ROI_(t) may also be increased, for example, in circumstances in which the predicted frame location S_(PRED) may be less reliable, for example, with faster movements (e.g., a relatively large change between the predicted frame position S_(PRED) and the previous frame position S_(t−1)) and/or relatively low confidence values being associated with each of the visual motion estimate, the imaging device motion estimate, and/or the subject motion estimate. Alternatively, the ROI_(t) may be sized to a default size or may not change in size for different image frames IF (e.g., have a fixed size, such as ¼, ⅛, or 1/16 of a total size of the image frames).
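
By way of illustration only, sizing and placing the ROI_(t) around the predicted frame location S_(PRED) might be sketched as follows; the particular scaling heuristic and constants are assumptions of this sketch:

    def determine_roi(pred_pos, prev_pos, frame_size, min_conf, base_frac=0.125):
        # Center an ROI on the predicted frame position S_PRED, enlarging it
        # for faster motion and for lower-confidence estimates (heuristic).
        fw, fh = frame_size
        speed = ((pred_pos[0] - prev_pos[0]) ** 2 +
                 (pred_pos[1] - prev_pos[1]) ** 2) ** 0.5
        scale = 1.0 + speed / 50.0 + (1.0 - min_conf)
        w = min(fw, fw * base_frac * scale)
        h = min(fh, fh * base_frac * scale)
        # Clamp the ROI so it stays within the image frame.
        x = min(max(pred_pos[0] - w / 2, 0), fw - w)
        y = min(max(pred_pos[1] - h / 2, 0), fh - h)
        return (x, y, w, h)

    print(determine_roi((650, 360), (620, 350), (1280, 720), min_conf=0.6))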

Variations of the techniques 400, 510, and 610 are contemplated. For example, in the technique 610, the determining 650 of the combined motion estimate may be omitted, and the determining 660 of the ROI_(t) may be performed directly with the visual motion estimate, the imaging device motion estimate, and/or the subject motion estimate. Furthermore, one or more of the operations for the determining 626, 634, and 644 of the various motion estimates may be omitted, with the operation for the determining 650 of the combined motion estimate or the operation for the determining 660 of the ROI being performed with the image frames and/or motion information from the operations of obtaining 622, 632, 642.

One or more of the modules 310-317, 320, 330 and the techniques 400, 510, and 610 can be performed and/or implemented, for example, by executing a machine-readable program or other computer-executable instructions, such as instructions or programs described according to JavaScript, C, or other such instructions. The steps, or operations, of the modules or techniques, or any other technique, method, process, or algorithm described in connection with the implementations disclosed herein, can be implemented directly in hardware, firmware, software executed by hardware, circuitry, or a combination thereof, for example, of the MIA 20, the imaging device 100, the external device 50, and/or the tracking system 60.

Trajectory Generation for Subject Tracking

Degrees of Freedom

Referring to FIGS. 7A-7C, a flight or tracking system 700 and a method or technique 700 a performed thereby are provided for receiving user instructions for moving the MIA 20 and the imaging device 100 relative to a target T, so as to maintain the target T within the image frames of images captured by the imaging device 100. The tracking system 700 and the method performed thereby may be included in and/or implemented by various components of the movable imaging system 10 (e.g., the MIA 20, the imaging device 100, the external device 50, the tracking system 60, etc.). For example, the tracking system 700 includes a module 710 (e.g., a user input module) for receiving user inputs, for example, via the external device 50.

Once a subject or a target has been determined as present in a video stream captured by an aerial subject tracking system or MIA 20, it is desirable to automatically or semi-automatically accurately frame the subject within the video image frames. For stationary targets, manual framing may not be too difficult once manual control of the movable platform 40 has been mastered. However, moving targets can present a much more complex scenario, and precise manual control becomes much more difficult.

According to an implementation, an automatic or semi-automatic control of the MIA 20 can be effected to operate within certain constraints. According to a first constraint, and referring to FIGS. 7A and 7B, which are pictorial illustrations of the MIA 20 and the imaging device 100 of the MIA 20 with respect to a target T, when the target T moves, a motion of the MIA 20 can be defined as having the MIA 20 follow the target T with a constant delta in altitude (e.g., vertical) and horizontal position with respect to the target T. A constant delta in the horizontal position can mean: a) the horizontal position of the target T is fixed within the video image frames, that is, the MIA 20 moves as the target T changes direction of travel (e.g., the MIA 20 will remain behind the target and adapt automatically to changes in direction of travel); or b) the horizontal position of the target T is fixed in a GPS frame, meaning the MIA 20 position is fixed irrespective of a direction of travel of the target T. The motion of the MIA 20 may thus be described as relative to a frame of reference (FOR) that is either the target T or a fixed GPS framework.

A user may provide input to the MIA 20 via the external device 50, such as the MIA controller and UI described with respect to FIG. 1. This may allow control of, or selection of, e.g., five DOFs, three of which are related to control of the movable platform 40 relative to the target, and two of which are related to orientation of the imaging device 100 with respect to the movable platform 40. That is, the user may select the position of the MIA 20 relative to the target (e.g., the MIA position) and the position of the target within the image frame (e.g., the target frame position), while operation of the MIA 20 and the imaging device movement mechanism 30 (e.g., the gimbal) is performed automatically (e.g., by a controller of the MIA 20 and/or the external device 50) to achieve the MIA position and the target frame position. As discussed in further detail below, the user may select the MIA position of the MIA 20 relative to the target directly (e.g., inputting specific values), via a predetermined flight pattern (e.g., a choreographed flight pattern), or both, and the user may select the frame position of the target in the image frame directly, via predetermined scene selections, or both. The distances or coordinates of the MIA position and the frame position may be referred to as user-selectable degrees of freedom or user-selectable constraints. As also discussed in further detail below, the MIA 20 (e.g., via a controller thereof and/or the controller) controls movement of the MIA 20 to achieve the MIA position and the frame position by controlling movement of the MIA 20 in real space (e.g., six degrees of freedom including translation in the X-, Y-, and Z-axes, and yaw, pitch, and roll) and movement of the imaging device 100 relative thereto via the imaging device movement mechanism 30 (e.g., in two or three degrees of freedom including yaw, pitch, and roll). Movement of the MIA 20 may be referred to as occurring in MIA degrees of freedom (e.g., MIA DOFs), and movement of the imaging device 100 relative to the MIA 20 may be referred to as occurring in imaging device degrees of freedom (e.g., imaging device DOFs).

As illustrated in FIG. 7A, according to an implementation, the MIA 20 can be set to operate according to: a) a first user-selectable DOF 740 in which the MIA 20 moves in a radial direction towards or away from the target T (e.g., a horizontal distance between the MIA 20 and the target T); b) a second DOF 741 in which the MIA 20 moves in a tangential direction (e.g., a circumferential or angular position of the MIA 20 relative to the target T), i.e., along a circular trajectory around the target; and c) a third DOF 742 in which the MIA 20 moves in a vertical direction or in altitude relative to the target T (e.g., a vertical distance between the MIA 20 and the target T). As referenced above, the circumferential position may be defined relative to a trajectory of the target T (e.g., 0 degrees being in front of the target and 180 degrees being behind the target) or a fixed frame of reference (e.g., GPS coordinates, such as 0 degrees being north and 180 degrees being south).

As illustrated in FIG. 7B, and according to an implementation, the imaging device 100 can be rotated by use of, e.g., the imaging device movement mechanism 30, such as a gimbal, to allow adjustment of the imaging device 100 within the MIA 20. The user input via the external device 50 can thus be set to operate according to: d) a fourth DOF 743 in which the vertical position of the target T may be adjusted within the video stream (e.g., image frame) by, e.g., pitching the imaging device movement mechanism 30 (e.g., a vertical frame position); and e) a fifth DOF 744 in which the horizontal position of the target T within the camera stream may be adjusted by yawing the imaging device movement mechanism 30 and/or the MIA 20 (e.g., a horizontal frame position). The orientation of the image frame relative to the target T and/or relative to a horizontal plane may be maintained by rolling, pitching, or yawing the imaging device movement mechanism 30 (e.g., as the MIA 20 rolls, pitches, or yaws to achieve translational movement, as with a quadcopter type device). By combining operations of all five user-selectable DOFs 740, 741, 742, 743, 744 discussed above, the MIA 20 and the imaging device 100 can automatically adjust position (e.g., the horizontal, vertical, and circumferential positions of the MIA 20 relative to the target T) and orientation (e.g., the roll, pitch, and yaw of the MIA 20 relative to the target T) together with the orientation (e.g., pitch, heading (i.e., yaw), and/or roll angles) of the imaging device 100 relative to the MIA 20 (i.e., by operating the imaging device movement mechanism 30). This may ensure the correct placement of the target T or subject within the image (e.g., in the image frames) as well as the correct relative position of the MIA 20 with respect to the target T or subject.
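
By way of illustration only, resolving the three MIA DOFs 740, 741, 742 into a desired MIA position relative to the target T might be sketched as follows; the coordinate conventions and function names are assumptions of this sketch:

    import math

    def mia_position_from_dofs(target_pos, target_heading_deg, horiz_dist,
                               circ_deg, vert_dist, trajectory_relative=True):
        # DOF 740: horiz_dist; DOF 741: circ_deg; DOF 742: vert_dist.
        # With trajectory_relative=True, 0 degrees is in front of the target;
        # otherwise 0 degrees is a fixed (e.g., north) reference.
        angle = circ_deg + (target_heading_deg if trajectory_relative else 0.0)
        rad = math.radians(angle)
        x = target_pos[0] + horiz_dist * math.cos(rad)
        y = target_pos[1] + horiz_dist * math.sin(rad)
        z = target_pos[2] + vert_dist
        return (x, y, z)

    # Target heading 90 degrees (traveling along +Y); 180 degrees places the
    # MIA 10 m behind the target and 5 m above it.
    print(mia_position_from_dofs((0.0, 0.0, 0.0), 90.0, 10.0, 180.0, 5.0))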

These user-selectable DOFs 740, 741, 742, 743, 744 (e.g., user-selectable constraints) can be operated individually or in combination. The user-selectable DOFs 740, 741, 742, 743, 744 may be input directly by the user and/or may be choreographed over time to produce complex motion of the imaging device 100 relative to the target T.

For example, for a first period of time, motion may be constrained to operating solely within the second DOF 741, but then for a second period of time, combined constraints of the first DOF 740, the third DOF 742, and the fourth DOF 743 may be used in order to produce choreographed cinematic type video of the target T. The constraints may be implemented using tracking techniques defined herein.

For example, the user may input the DOF 740 (e.g., the radial or horizontal distance), the DOF 741 (e.g., the circumferential or angular position), and the DOF 742 (e.g., the vertical distance) individually and as fixed values. The user may also input a frame of reference by which the DOF 741 (i.e., the circumferential or angular position) is determined according to a trajectory of the target T or a fixed reference frame (e.g., GPS coordinates). The user may instead input one or more of the DOFs 740, 741, and 742 in conjunction with a choreographed flight pattern (e.g., a predetermined flight pattern) in which one or more of the other DOFs 740, 741, and 742 are varied automatically. In one example, the user may input two of the DOFs 740, 741, 742, while the third of the DOFs 740, 741, 742 is varied according to a choreographed flight pattern that is selectable by the user. For example, the user may input the DOF 740 (e.g., the horizontal distance) and the DOF 742 (e.g., the vertical distance) and select a choreographed DOF 741 by which the DOF 741 (e.g., the circumferential position) is varied automatically (e.g., to orbit the target T at a predetermined, fixed, variable, or user-selectable speed). In another example, the user may input one of the DOFs 740, 741, 742, while the other two DOFs 740, 741, 742 are varied according to a choreographed flight pattern that is selectable by the user. For example, the user may input the DOF 741 (e.g., the circumferential position) and select choreographed DOFs 740, 742 by which the DOF 740 (e.g., the horizontal distance) and the DOF 742 (e.g., the vertical distance) are varied automatically (e.g., to fly away from and back toward the target T at predetermined or user-selectable positions at a fixed or user-selectable speed).

The user may input the DOF 743 (e.g., the vertical target frame position) and the DOF 744 (e.g., the horizontal target frame position) individually and as fixed values. The user may specify a particular location, region, or bounding box within the image frame over which or in which the target T is to be positioned, for example, by inputting the DOF 743 and the DOF 744, and/or a size of a region or bounding box.

Further, the user may be guided or restricted in the DOF 743 and the DOF 744 according to a setting of the imaging device 100, such as a frame width setting. For example, the imaging device 100 may be configured with different settings for capturing images with different widths of image frames. For wider settings, the captured images may be subject to greater distortion moving closer to the edges of the image frames. Accordingly, the user may be guided to input the DOFs 743 and 744 where less distortion would be expected, or may be restricted (i.e., prevented) from inputting the DOFs 743, 744 where too great a distortion might be expected (e.g., for capturing quality images of the target T and/or for visually tracking the target T).
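
By way of illustration only, guiding or restricting the DOFs 743, 744 according to a frame width setting might be sketched as clamping the requested frame position into a low-distortion region; the per-setting margins below are assumptions of this sketch:

    # Illustrative per-setting margins: wider capture settings exclude more
    # of the frame edge, where lens distortion is greatest.
    ALLOWED_MARGIN = {"narrow": 0.05, "medium": 0.10, "wide": 0.20}

    def clamp_frame_position(x_frac, y_frac, width_setting):
        # Clamp a requested frame position (as fractions of frame width and
        # height) into the allowed low-distortion region for the setting.
        m = ALLOWED_MARGIN[width_setting]
        clamp = lambda v: min(max(v, m), 1.0 - m)
        return clamp(x_frac), clamp(y_frac)

    print(clamp_frame_position(0.95, 0.5, "wide"))  # (0.8, 0.5): edge input restricted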

Still further, the user may input the DOF 743 and the DOF 744 according to a predetermined scene selection, for example, in which the target T is positioned within the image frames according to the rule of thirds, as selected by the user.

The user may input the DOFs 740, 741, 742, 743, 744, for example, via the external device 50 (e.g., using physical buttons, a touch screen, and/or voice inputs). Operation of the MIA 20 and the imaging device movement mechanism 30 to achieve the DOFs 740, 741, 742, 743, 744 may be controlled by the external device 50, a controller of the MIA 20, and/or a controller of the imaging device 100 (e.g., according to instructions stored in memory and executed by a processor according to user input of the DOFs, various other information obtained from various sensors (e.g., IMU, position sensors, GPS, or other metadata sources 144), and image information (e.g., from processing image frames captured by the imaging device 100)).

Referring to FIG. 7C, a block diagram is provided for the tracking system 700 in which the user may input the MIA position and the target frame position and by which the MIA 20 and the imaging device 100 are operated.

In a first module 710 (e.g., a user input module), user inputs are received. The user inputs may, for example, be received by the external device 50. In a submodule 712 (e.g., a MIA position module), user inputs are received for the MIA position, which may include receipt of inputs for the DOF 740 (horizontal distance), the DOF 741 (circumferential position), and the DOF 742 (vertical distance). As described above, the submodule 712 may receive inputs as one or more of (a) fixed values for the DOFs 740, 741, 742, (b) fixed values for two of the DOFs 740, 741, 742 and a user-selectable choreographed flight pattern by which the other of the DOFs 740, 741, 742 is varied, or (c) fixed values for one of the DOFs 740, 741, 742 and another user-selectable choreographed flight pattern by which the two others of the DOFs 740, 741, 742 are varied. The submodule 712 may also receive a user input specifying a frame of reference as either being fixed (i.e., fixed in real space) or trajectory dependent (i.e., based on a trajectory of the target). The submodule 712 may, for example, receive the user inputs via the external device 50.

In a second submodule 714 (e.g., a frame position module), user inputs are received for the target frame position, which may include receipt of inputs for the DOF 743 (vertical frame position) and the DOF 744 (horizontal frame position). As described above, the submodule 714 may receive user inputs as one or more of (a) a position, (b) a region, or (c) a bounding box within the image frame. The submodule 714 may also receive user input of a size of the bounding box. Still further, the second submodule 714 may, based on an image frame width setting, guide or restrict the user to limited inputs for the DOFs 743, 744, or allow the user to select a scene selection by which the DOFs 743, 744 are predetermined.

In a third submodule 716 (e.g., a camera mode module), the user may input a camera mode selection pertaining to an image frame width setting.

In a second module 720 (e.g., a sensor information module), sensor or movement information is determined. In a first submodule 722, sensor information may be determined for the camera mode (e.g., the image frame width setting) and/or an image stream from the imaging device 100. In a second submodule 724 (e.g., a MIA and imaging device motion module), sensor or movement information is obtained regarding motion of the MIA 20 and the imaging device 100 relative thereto, such as the position and/or orientation of the MIA 20 in real space and of the imaging device 100 relative to the MIA 20, and changes (e.g., velocity) or rates of changes (e.g., acceleration) thereof. Such motion information may be obtained from sensors of the MIA 20 (e.g., IMU, GPS, altimeter, etc.), the imaging device movement mechanism 30 (e.g., position sensors thereof), and/or the imaging device 100 (e.g., sensors thereof, such as an IMU or accelerometers, and/or derived from the image stream captured thereby).

In a third module 730 (e.g., a predicted target motion module), predicted motion and/or future positions of the target T are determined, for example, according to the image stream (e.g., by identifying the target T in image frames, determining positions of the target T in the image frames, and determining changes in position of the target T in the image frames).

In a fourth module 738 (e.g., a motion determination module), desired motions for the MIA 20 and the imaging device 100 are determined according to the predicted motion of the target T (i.e., from the module 730) and motion information of the MIA 20 and the imaging device 100 relative thereto (i.e., from the submodule 724) to achieve the user-selectable DOFs 740, 741, 742, 743, 744. For example, using a motion model of the MIA 20, desired motion of the MIA 20 is determined according to the predicted motion of the target T and the motion information obtained and/or derived from the submodule 724 (e.g., current position and orientation of the MIA 20 relative to the target T, changes therein, and rates of change therein) to achieve the DOFs 740, 741, 742 (i.e., horizontal, angular, and vertical positions of the MIA 20 relative to the target T) at subsequent times corresponding to the predicted motion or positions of the target T. Desired motion of the imaging device 100 relative to the MIA 20 may be determined according to the predicted motion of the target T and the desired motion of the MIA 20, so as to achieve the DOFs 743, 744 (i.e., the vertical and horizontal frame positions).

In a fifth module 739 (e.g., a movement control module), the MIA 20 and the imaging device movement mechanism 30 are controlled to achieve the desired motion of the MIA 20 and the imaging device 100 relative thereto. For example, in the case of the MIA 20 being a quadcopter, rotors of the MIA 20 may be rotated at different rates so as to yaw, pitch, and roll the MIA 20 to translationally move the MIA 20. In the case of the imaging device movement mechanism 30 being a three-axis gimbal, motors pivot the imaging device 100 relative to the MIA 20 about the three axes.

The modules 710, 720, 730, 738, 739 and submodules thereof may be implemented by one or more of the tracking system 60, the MIA 20, the external device 50, the imaging device 100, and/or various hardware components thereof (e.g., processors, memories, and/or communications components). Further, it should be understood that the various submodules may be standalone modules separate from the parent module or other submodules associated therewith.

Referring to FIG. 7D, a set of operations of the method 700 a are described for controlling the MIA 20 according to user input instructions. At 710 a, user inputs are received, for example, by the external device 50. The user inputs include MIA position inputs, which are used to define a position of the MIA 20 with respect to the target T, and frame position inputs, which are used to define a position of the target T within image frames, such as those captured by the imaging device 100. At 712 a, the MIA position inputs are received, which may include one or more degrees of freedom or constraints according to which the MIA 20 is to be moved relative to the target T. The MIA position inputs may define one or more of a horizontal distance, a circumferential position, or a vertical distance between the MIA 20 and the target T. One or more of the MIA position inputs may be received as fixed values that define the horizontal distance, the circumferential position, or the vertical distance. One or more of the user inputs may be received as a selection of a choreographed flight pattern by which another of the horizontal distance, the circumferential position, or the vertical distance is varied. The MIA position inputs may additionally define a frame of reference of the MIA 20, for example, as being relative to a fixed reference frame or a trajectory of the target.

At 714 a, the frame position inputs are received, which may include one or more degrees of freedom or constraints according to which the target T is to be positioned within image frames captured by the imaging device 100. The frame position inputs may define one or more of a horizontal position or a vertical position of the target within the image frame. One or more of the frame position inputs may be received as fixed values that define the horizontal position or the vertical position, which may include defining a position of the target T (e.g., a pixel location), a bounding box (e.g., a region constrained in horizontal and vertical dimensions), or another region (e.g., a horizontal or vertical region). When receiving the frame position inputs, the user may be guided or restricted to provide frame position inputs according to an image frame width.

At 716 a, user inputs may also be received to specify a camera mode that defines an image frame width setting.

At 720 a, movement information is determined for the MIA 20 and for the target T. At 722 a, target movement information is determined for the target T, which may be derived from image frames captured by the imaging device 100 and processed to locate the target T within the image frames and/or to locate the target T with respect to the MIA 20. Target movement information may, for example, include a position and/or velocity of the target T relative to a reference frame and/or the MIA 20.

At 724 a, MIA movement information is determined for the MIA 20, which may be collected from sensors associated therewith (e.g., the metadata sources 144 thereof, which may include an IMU, GPS sensor, accelerometers, gyroscopes, altimeters, etc.). MIA movement information may, for example, include position and velocity of the MIA 20 relative to a reference frame (e.g., translational movement) and may also include orientation and orientation change rates of the MIA 20 relative to the reference frame (e.g., roll, pitch, and/or yaw). MIA movement information may be used to determine the target movement information, for example, by accounting for changes in position and/or orientation of the MIA 20 when evaluating motion of the target T between the image frames.

At 730 a, target motion of the target T is predicted according to the target movement information. For example, a position of the target T (e.g., a predicted or future target position) may be predicted or determined for one or more future times. For example, the predicted target position may be determined according to a previous position and velocity of the target T (e.g., past target positions and target velocity).
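
By way of illustration only, predicting the target position from past target positions and target velocity might be sketched as follows, under a constant-velocity assumption:

    def predict_target_position(p_prev, p_curr, dt_past, dt_future):
        # Estimate target velocity from two timestamped positions and
        # extrapolate to a future time (constant-velocity assumption).
        v = [(c - p) / dt_past for p, c in zip(p_prev, p_curr)]
        return tuple(c + vi * dt_future for c, vi in zip(p_curr, v))

    # Target moved 2 m in x over 1 s; predict its position 0.5 s ahead.
    print(predict_target_position((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 1.0, 0.5))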

At 738 a, desired motion (e.g., movement instructions) of the MIA 20 and the imaging device 100 relative thereto is determined according to the predicted target position and the MIA motion information to achieve the MIA position inputs (e.g., the horizontal distance, circumferential position, vertical distance, and/or frame of reference of the MIA 20 relative to the target T) and the frame position inputs (e.g., the horizontal position and the vertical position of the target T within image frames) at the one or more future times.

At 739 a, the MIA 20 and the imaging device 100 are moved to achieve the desired motion, thereby achieving the MIA position inputs and the frame position inputs. For example, the movement instructions are executed to operate the MIA 20 and the imaging device movement mechanism 30.

Steps 720 a to 739 a are then repeated to continue to achieve the MIA position inputs and the frame position inputs. Step 710 a may be repeated to receive new user inputs.

Flight Restriction Volumes

Referring to FIGS. 7E-7G, a flight or tracking system 700′ and a method or technique implemented thereby are provided to restrict movement of the MIA 20 within restricted areas defined relative to the target T (e.g., for tracking and/or collision avoidance purposes). The tracking system 700′ and the method performed thereby may be included in and/or implemented by various components of the movable imaging system 10 (e.g., the MIA 20, the imaging device 100, the external device 50, the tracking system 60, etc.).

It may be desirable to create certain flight restriction volumes or zones in order to ensure the safety of the user and at the same time ensure that the tracking system associated with the MIA 20 continues to function robustly. To that end, regardless of other MIA 20 motion trajectories or constraints, a further delineation of allowable and non-allowable volumes relative to a target may be defined within which flight is permitted or not permitted, respectively. These allowable and non-allowable volumes may override other calculations of trajectories for the MIA 20 in order to maintain safety of persons or property (including the MIA 20), or to ensure that the subject S remains within view of the imaging device 100.

FIG. 7E is a pictorial perspective view of the MIA 20 operating outside predefined restricted zones 745, or within predefined volumes. A restricted zone 746 (e.g., a first volume) may be defined as an outermost boundary outside of which the MIA 20 may not operate. In one implementation, this restricted zone 746 could be, e.g., a half-sphere (or approximation thereof) whose surface constitutes a predefined maximum distance allowable from the MIA 20 to the target T to ensure that the tracking system 700′ does not lose the target T (e.g., is able to locate the target T, for example, using direct or indirect wireless communication between the target T and the MIA 20 and/or visual identification of the target T in successive image frames obtained by the imaging device 100). This first restricted zone 746 could also include a boundary that ensures that a distance between the MIA 20 and the external device 50 or the subject S (e.g., when using a GPS position of the subject), when a direct wireless link exists, does not exceed a maximum range of the wireless connection. The maximum range can be variable and can be a function of the number of other devices operating within a same Wi-Fi frequency spectrum or may be based on other factors that can impact transmission distances. A margin of safety may be applied to any of the volumes, surfaces, or surface point distances discussed herein. Other constraints may also be incorporated into the definition of the restricted zone 746, such as no-fly zones, etc.; conversely, a first volume may instead be defined in which flight of the MIA 20 is permitted.

A second restricted zone 747 may be defined by, e.g., a cylinder whose surface represents a minimum distance to the target T and within which constitutes a no-fly zone around the subject to ensure the safety of the subject. Finally, a third restricted zone 748 (e.g., a conical region) may be defined to account for a maximum extent of pitch permitted for the imaging device 100 with respect to the MIA 20 in order to ensure that the tracking system 700′ does not lose the target T (e.g., that the target T is not outside a field of view of the imaging device 100). For example, the imaging device 100 may, by the imaging device movement mechanism 30, have a limited range of motion relative to the MIA 20 that results in regions below the MIA 20 being outside the field of view of the imaging device 100. The third restricted zone 748 is a region relative to the target T into which the MIA 20 is avoided or prevented from flying in order to maintain the target T within the field of view of the imaging device 100. This restricted zone 748 may be defined as a cone, and operation of the MIA 20 within this cone may be avoided.
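
By way of illustration only, membership tests for the three restricted zones 746, 747, 748 might be sketched as follows; the half-sphere, cylinder, and cone parameters are assumptions of this sketch:

    import math

    def in_restricted_zone(mia_pos, target_pos, max_range=50.0,
                           min_radius=3.0, max_pitch_deg=60.0):
        # Return the first violated zone, or None. Zone 746: beyond a
        # half-sphere of radius max_range. Zone 747: inside a cylinder of
        # radius min_radius. Zone 748: a cone above the target where the
        # required camera pitch would exceed the gimbal's travel limit.
        dx = mia_pos[0] - target_pos[0]
        dy = mia_pos[1] - target_pos[1]
        dz = mia_pos[2] - target_pos[2]
        horiz = math.hypot(dx, dy)
        if math.sqrt(dx * dx + dy * dy + dz * dz) > max_range:
            return 746
        if horiz < min_radius:
            return 747
        if dz > 0 and math.degrees(math.atan2(dz, horiz)) > max_pitch_deg:
            return 748
        return None

    print(in_restricted_zone((2.0, 0.0, 20.0), (0.0, 0.0, 0.0)))  # 747: too close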

These restricted zones 746, 747, 748 may also be designed to take into consideration motion of the target T in the image caused by the motion of the MIA 20. This motion may be kept within certain predefined limits to ensure proper operation of the tracking system. In other words, changes in speed and direction of the MIA 20 may be constrained to occur below a certain change rate if the MIA 20 is operating in a mode where it tracks the target T. If a motion estimate of the target T is available, this information may be incorporated to reduce the maximal allowed motion.

If a trajectory of the MIA 20 established by other criteria would cause the MIA 20 to enter a non-allowed volume, the trajectory may be modified so that it remains within an allowed volume. For example, the trajectory of the MIA 20 may be modified to include a point within the allowed volume nearest a point of the original trajectory that was within a non-allowed volume.

Referring to FIG. 7F, a block diagram is provided for a flight or tracking system 700′ that implements a technique or method 700 a′ by which the restricted flight zones are utilized.

In a first module 710′ (e.g., a motion prediction module), predicted motions of the restricted zones 746, 747, 748 and the MIA 20 are determined. In a first submodule 712′ (e.g., a restriction zone motion or target motion module), predicted motion of the restricted zones 746, 747, 748 is determined by predicting motion of the target T. For example, motion of the target T may be predicted according to past positions of the target T relative to a reference frame (e.g., GPS coordinates) or the MIA 20, which may have been determined visually (e.g., according to identifying and locating the target T in past image frames captured by the imaging device 100) and/or according to sensor information (e.g., obtained by sensors associated with the target T, the MIA 20, and/or the imaging device 100). In a second submodule 714′ (e.g., a MIA predicted motion module), predicted motion of the MIA 20 is determined according to intended flight instructions. The intended flight instructions may, for example, include user-defined flight instructions (i.e., based on inputs from a user, such as for translational movement in vertical and horizontal directions) and/or automated flight instructions (e.g., for the MIA 20 to follow the target T). The predicted motion of the MIA 20 may be determined, for example, according to a motion model of the MIA 20 and the intended flight instructions for the subsequent times. The predicted motion of the MIA 20 may also be determined according to motion information of the MIA 20 (e.g., position and/or orientation, changes therein, and/or rates of change therein), which may be determined according to the image stream of the imaging device 100 and/or sensors of the MIA 20 (e.g., IMU, GPS, altimeter, etc.) and accounted for in the motion model. Motion of the target T and/or the MIA 20 may be determined in the manners described above with respect to the tracking system 300 and the technique 400.

In a second module 720′ (e.g., a flight intrusion module), it is predicted whether the predicted motions of the MIA 20 and the restricted zones 746, 747, 748 will result in the MIA 20 flying (e.g., intruding) into the restricted zones 746, 747, 748. In a first submodule 722′ (e.g., a max distance module), it is determined whether the predicted motion would result in the MIA 20 flying into the restricted zone 746 (e.g., outside a radial distance from the target T, such as a distance at which the target T can no longer be tracked or identified in image frames, or another distance value). In a second submodule 724′ (e.g., a minimum distance module), it is determined whether the predicted motion would result in the MIA 20 flying into the restricted zone 747 (e.g., inside a radial or circumferential distance from the target T, such as a distance to prevent inadvertent collisions between the MIA 20 and the target T). In a third submodule 726′ (e.g., an overhead module), it is determined whether the predicted motion would result in the MIA 20 flying into the restricted zone 748 (e.g., inside a region in which the target T will or may be outside the field of view of the imaging device 100, such as due to travel limits of the imaging device movement mechanism 30). It should be noted that fewer or more restricted zones may be defined relative to the target T, such that fewer or more modules may be utilized. Further, a single module may cooperatively determine whether the predicted motion would result in the MIA 20 flying into any of the multiple restricted zones.

In a third module 730′ (e.g., a flight instruction module), executable flight instructions are determined. In a first submodule 732′ (e.g., an intended flight module), if the predicted motion of the MIA 20 is determined to not take the MIA 20 into the restricted zones 746, 747, 748, the intended flight instructions are determined to be the executable flight instructions. In a second submodule 734′ (e.g., a modified flight module), if the predicted motion of the MIA 20 is determined to take the MIA 20 into one of the restricted zones 746, 747, 748, modified flight instructions are determined to be the executable flight instructions. The modified flight instructions are different from the intended flight instructions and are predicted to not take the MIA 20 into any of the restricted zones 746, 747, 748.

In a fourth module 738′ (e.g., a movement module), the MIA 20 is controlled according to the executable flight instructions.

The various modules 710′, 720′, 730′, 738′ and the submodules thereof may be implemented by one or more of the tracking system 60, the MIA 20, the external device 50, the imaging device 100, and/or various hardware components thereof (e.g., processors, memories, and/or communications components). Further, it should be understood that the various submodules may be standalone modules separate from the parent module or other submodules associated therewith.

Referring to FIG. 7G, a set of operations of the method 700 a′ are described for controlling the MIA 20 according to restricted flight zones.

At 710 a′, motions of one or more restricted zones 746, 747, 748 and the MIA 20 are predicted. At 712 a′, motion of the one or more restricted zones is predicted. For example, future position(s) of the one or more restricted zones are determined or predicted for one or more future times. Because the restricted zones are defined relative to the target T, in predicting motion (e.g., future positions) of the restricted zones, motion (e.g., future positions) of the target T may be predicted. Prediction of the motion of the target T may be performed as described previously, for example, by determining the position and velocity of the target T according to image frames previously captured by the imaging device 100. Motion information of the MIA 20 may be used to determine target motion, for example, by taking into account position, velocity, orientation, and change in orientation of the MIA 20 determined, for example, according to the metadata sources 144 (e.g., various movement sensors) associated with the MIA 20.

The restricted zones may include one or more of the restricted zones 746, 747, 748. The restricted zone 746 may define a maximum allowable distance between the MIA 20 and the target T (e.g., outside of which travel is restricted). The restricted zone 747 may define a minimum allowable distance between the MIA 20 and the target T (e.g., inside of which travel is restricted). The restricted zone 748 may define a region overhead or above the target T (e.g., inside of which travel is restricted), which may be a region in which the target T may be outside a field of view of the imaging device 100.

At 714 a′, motion of the MIA 20 is predicted according to the intended flight instructions. For example, future positions of the MIA 20 are predicted or determined for future time(s) corresponding to the future time(s) associated with the predicted positions of the restricted zones 746, 747, 748. Predicted motion of the MIA 20 may be determined as described previously, for example, according to a motion model of the MIA 20 that predicts movement of the MIA 20 according to movement characteristics of the MIA 20, the intended flight instructions, and the MIA motion information.

At 720 a′, it is determined or predicted whether the predicted motion of the MIA 20 will result in the MIA 20 travelling into the restricted zones 746, 747, 748 at the future times. For example, the future position of the MIA 20 is compared to the restricted zones 746, 747, 748 at their respective predicted positions for the future time(s). The intended flight instructions may, for example, be input manually by the user or be performed according to choreographed flight maneuvers.

At 730 a′, executable flight instructions are determined. If, at 720 a′, the MIA 20 is predicted to not travel into one of the restricted zones 746, 747, 748 at the future time, the intended flight instructions are determined to be the executable flight instructions. If, at 720 a′, the MIA 20 is predicted to travel into one of the restricted zones 746, 747, 748 at the future time, modified instructions are determined to be the executable flight instructions. The modified instructions are predicted to not result in the MIA 20 traveling into the restricted zones 746, 747, 748.

At 738 a′, the MIA 20 is controlled according to the executable flight instructions.

Scene Composition and Framing Preservation

Cinematography benefits significantly from utilizing composition and framing techniques that have been historically developed. Such techniques can be applied with regard to the images and video obtained by use of the MIA 20. This introduces greater complexity than simply identifying and keeping track of a single subject or target T, as it may involve cinematic framing and trajectory generation by defining, identifying, and/or detecting a subject, multiple subjects, and/or a scene, and/or a cinematic element such as a backlight, a horizon, or other compositional aspects. The following techniques may be applied to the system.

First, consideration may be given to placement of a target T within a particular scene. Determining which features form parts of the scene can be useful so that the target T can be in front of the scene and preferably not obscured by parts of the scene during movement. Backlight may be considered to be in front of the scene and behind subject(s), and the maintenance of backlight (or any other particular form of lighting) can be set as a parameter constraining motion. Fixtures or stationary objects may be considered as located in a fixed place throughout a scene, whereas subjects may be considered as dynamic actors within a scene.

FIG. 8 is a pictorial representation of a video image frame 630 d that illustrates an application of the rule of thirds, which splits a frame into a three-by-three grid that defines ideal placements for various elements within the frame as shown. The imaging device 100 may be positioned to maintain the horizon at an upper-third position within the frame 630 d, here, along a topmost horizontal grid line, and the target T within the left third of the frame 630 d. In other applications of the rule of thirds, the horizon may be locked along the other of the horizontal grid lines, and the target T can be captured so as to be located near various intersections of horizontal and vertical grid lines.
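
By way of illustration only, the rule-of-thirds grid intersections and horizon lines for a frame might be computed as follows:

    def rule_of_thirds_anchors(frame_w, frame_h):
        # Return the four intersections of the three-by-three grid (candidate
        # placements for the target T) plus the grid-line ordinates along
        # which a horizon may be locked.
        xs = (frame_w / 3, 2 * frame_w / 3)
        ys = (frame_h / 3, 2 * frame_h / 3)
        intersections = [(x, y) for x in xs for y in ys]
        return intersections, ys

    anchors, horizon_lines = rule_of_thirds_anchors(1920, 1080)
    print(anchors)         # four grid intersections
    print(horizon_lines)   # (360.0, 720.0): upper- and lower-third lines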

Other known compositional techniques may be further applied, such as the golden ratio, use of diagonals, element balancing, leading lines, symmetry and patterns, use of negative space, and/or other techniques. A composition can ensure that there is adequate headroom for the subject, i.e., that the subject is framed such that the distances between subject features, the top of the subject, and the top of the frame form reasonable ratios. Ratios may be sustained as the subject moves through the frame and as the imaging device 100 moves, for example, within or along with the MIA 20. Furthermore, a composition can ensure that there is adequate lead room, i.e., adequate space in front of a subject's motion or subject's heading.

All of the compositional techniques may be stored in a library along with algorithms and/or parameters used to define and implement the techniques. One or more of these compositional techniques may be selectable and operable simultaneously.

Any of the techniques described above for determining motion of the imaging device 100 or predicting or restraining motion of the subject S (or the target T) may be applied to creating and maintaining the compositional features described above. By way of example only, applying the constraints described above with respect to FIGS. 7A and 7B may be utilized to create these specific compositional features.

Voice Command Tracking

Referring to FIGS. 9A-10, a voice recognition system 70 or voice-controlled tracking or flight system 900 and a method or technique 900 a are provided for a user or operator to control movement of the MIA 20 using voice commands. The voice-controlled system 900 and the method performed thereby may be included in and/or implemented by various components of the movable imaging system 10 (e.g., the MIA 20, the imaging device 100, the external device 50, the tracking system 60, etc.).

When using visual tracking in a dynamic scenario (e.g., during action sports), the operator of the MIA 20 may not have the time or may not wish to control the subject tracking via physical (e.g., “hands-on”) operation of the external device 50. This may occur in scenarios where the operator of a tracking system 60 is also the target T that is being tracked, such as a rider on a mountain bike, skateboard, or surfboard.

FIG. 9A is a block diagram of an implementation of a voice recognition system 70 that may be utilized to perform the desired subject tracking without requiring, or by reducing, an amount of operator physical interaction with the external device 50. According to an implementation, the operator of the MIA 20 may carry or wear a microphone 701 connected to a voice recognition unit 703 that interprets audio or voice commands 702 from the operator and relays valid tracking commands 704 obtained from a command database 705 to the tracking system 60 of FIG. 1. The voice recognition unit 703 may comprise a speech-to-text converter unit. A searching algorithm can locate commands associated with the converted text in the command database 705 containing valid commands. The microphone 701 and/or the voice recognition unit 703 may, for example, be or be incorporated in the external device 50 or another device.
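
By way of illustration only, the flow from converted text to a valid tracking command 704 might be sketched as follows; the command database contents and the matching strategy are assumptions of this sketch:

    # Illustrative command database mapping normalized phrases to tracking
    # commands; a real database would be richer and support parameters.
    COMMAND_DATABASE = {
        "startup": "TAKEOFF",
        "shutdown": "RETURN_AND_LAND",
        "stop": "HOVER",
        "execute orbit": "ORBIT_TARGET",
    }

    def lookup_command(converted_text):
        # Search the command database for the longest valid command phrase
        # matching the start of the speech-to-text output.
        text = converted_text.lower().strip()
        for phrase in sorted(COMMAND_DATABASE, key=len, reverse=True):
            if text.startswith(phrase):
                args = text[len(phrase):].strip()
                return COMMAND_DATABASE[phrase], args
        return None, text  # no valid command found

    print(lookup_command("Execute orbit at five meters altitude"))
    # ('ORBIT_TARGET', 'at five meters altitude')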

Using the voice commands 750, the operator may direct the MIA 20 to operate in a wide variety of manners including, for example, executing basic flight and tracking instructions or commands. Commands for basic flight operations (e.g., basic flight commands) may, for example, pertain to starting flight (e.g., takeoff of the MIA 20 from a landed or home position) or ending flight (e.g., returning and landing the MIA 20 to the landed or home position). Such basic flight operations may be initiated with basic control commands, such as "startup," "shutdown," or "stop," given using the voice commands 750.

Tracking commands may, for example, include flight maneuver instructions and/or target identification instructions. Flight maneuver instructions may, for example, pertain to scripted flight maneuvers, which may be referred to herein as "ProMoves," and may execute control over the MIA 20 to fly in a partially or wholly predetermined manner relative to the target T. Such scripted or predetermined flight maneuvers may, for example, include orbiting the target T, flying away from and back to the target T, or another predetermined flight maneuver (e.g., a user-customized flight pattern). Flight maneuver instructions in the voice command 702 may specify further characteristics of the scripted flight maneuver, such as by specifying an orientation, relative vertical distance, relative horizontal distance, and relative speed of the MIA 20 relative to the target T. Target identification instructions in the voice command 702 allow the operator to specify which, of multiple subjects S, is to be the target T that the MIA 20 is to track or follow. As discussed below, the targets T may be pre-identified or may be identified during operation by a characteristic thereof identifiable with the imaging device 100.

In one example, the predetermined flight maneuver may be an orbit maneuver in which the MIA 20 orbits around the target T or a point of interest (POI). In an example where the voice command 702 includes "execute orbit at five meters altitude above the target T or a point of interest (POI) with a ten meter radius," the tracking system 60 may instruct the MIA 20 to move to a height of five meters above the target T and then continuously move tangentially about the target T at a distance of ten meters.
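
The geometry of this orbit maneuver might be sketched as follows (the function name, step count, and coordinate convention are assumptions; this is an illustration, not the disclosed control law):

    import math

    def orbit_setpoints(target_xyz, radius=10.0, altitude=5.0, steps=36):
        # Yield (x, y, z) positions on a circle of the commanded radius,
        # centered on the target and offset upward by the commanded altitude.
        tx, ty, tz = target_xyz
        for i in range(steps):
            angle = 2.0 * math.pi * i / steps
            yield (tx + radius * math.cos(angle),
                   ty + radius * math.sin(angle),
                   tz + altitude)

    # "execute orbit at five meters altitude ... with a ten meter radius"
    waypoints = list(orbit_setpoints((0.0, 0.0, 0.0)))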

In another example, the predetermined flight maneuver may cause the MIA 20 to fly away from and/or upward from the target T, and/or to fly toward and/or downward toward the target T. The voice command 702 may also instruct the MIA 20 to be positioned at an altitude five meters above the ground or to operate a "dronie" ProMove in which the MIA 20 is directed to point at the target T or the POI and then fly backwards/upwards, etc. (e.g., to fly away and upward from the target T).

A variety of measurement units may be utilized. For example, the units of feet and meters may be mixed together in a single command, and the voice recognition unit 703 or the tracking system 60 could convert the mixed units to a standardized set of units accordingly. Also, specifics as to a number of repeated operations could be received as part of the voice command 702, such as "execute orbit twice." In the event insufficient parameters are supplied to generate a complete command (e.g., the "ten meter radius" was omitted from the above voice command 702), the operator could either be voice prompted for the additional information and/or some predefined default value could be used.
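
A minimal sketch of such unit normalization and default handling (assuming, for illustration, that recognized quantities arrive as digits; the patterns, names, and default values are assumptions):

    import re

    UNIT_TO_METERS = {"meter": 1.0, "meters": 1.0, "foot": 0.3048, "feet": 0.3048}
    DEFAULTS = {"altitude": 5.0, "radius": 10.0}   # assumed default values

    def parse_parameter(text, name):
        # Look for "<number> <unit> <parameter>" in the recognized text.
        match = re.search(r"(\d+(?:\.\d+)?)\s*(meters?|feet|foot)\s+" + name, text)
        if match is None:
            return DEFAULTS[name]   # or voice-prompt the operator instead
        value, unit = float(match.group(1)), match.group(2)
        return value * UNIT_TO_METERS[unit]

    parse_parameter("execute orbit at 5 meters altitude with a 30 feet radius",
                    "radius")   # -> 9.144 (mixed units normalized to meters)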

Absolute distances may be used in the voice commands 750 (e.g., "execute orbit at five meters") as well as relative distances (e.g., "execute orbit five meters higher"). In the event that a direction of travel or the orientation of the subject is available, the operator may also give voice commands 750 that take this information into account. For example, the voice command 702 can include language such as "take a shot from my right side." The above voice commands 750 are presented as examples, but do not constitute a comprehensive list of voice commands 750.

In another example, the predetermined flight maneuver may be to track the target T. For example, the operator may state "track" or "follow" as the voice command 702, in which case the MIA 20 follows the target T as the target moves in real space. The operator may also provide an orientation command as part of the voice command 702, for example, to instruct the MIA 20 to fly in an orientation relative to the movement of the target (e.g., rearward, forward, rightward, or leftward thereof) or in an orientation relative to a reference frame in real space, such as GPS coordinates (e.g., north, east, south, or west thereof). Still further, the operator may provide a position command as part of the voice command 702, for example, to fly in a particular spatial relationship (e.g., vertical height and/or horizontal distance) relative to the target T, as described above.

As referenced above, the voice command 702 may include a target identifying instruction or command. FIG. 10 is a pictorial diagram of a target T comprising a plurality of selectable subjects S₁-S_(n) for use in describing implementation examples for the voice recognition system 70 of FIG. 9A. In addition to focusing on a single subject S as a target T, the voice commands 750 sent to the voice recognition system 70 may specify a collection of subjects S₁-S_(n) as the target T and/or be used to switch focus between several subjects S₁-S_(n).

The specifying of subject(s) S as targets T may be performed in at least two ways: teaching and object recognition. In a first way (teaching), before a shot is taken, a teach-in phase is performed during which the tracking system 60 learns characteristics of each subject, which may later be used for identifying the subject S as the target T (e.g., when receiving the voice commands 750 with a target identifying instruction). During the teach-in phase, identifying characteristics of each subject S₁-S_(n) are learned by the tracking system 60. For example, the MIA 20 may orbit each subject S₁-S_(n), capture images of each subject S₁-S_(n), and process the images to identify various characteristics, such as a type (e.g., human, vehicle, etc.), color, and other suitable identifying information, for example, using suitable object recognition algorithms. Each subject S₁-S_(n) may also be assigned a unique ID and/or voice identifier (e.g., a name), which are associated with the identifiable characteristics thereof. Object recognition algorithms may be utilized to associate the subject S with its assigned ID. Then, in an operational phase, the operator may switch the focus of the tracking system 60 during the shots to different subjects S₁-S_(n) using the voice commands 750, such as "switch focus to subject S₁." During the teach-in phase, instead of assigning unique IDs, actual names could be assigned to the subjects S₁-S_(n) to make operation simpler for the operator (e.g., "switch focus to Alex"). For example, when receiving voice commands 750 that include target identifying instructions, the tracking system 60 may identify the subject S instructed to be the target T according to one or more of the various identified characteristics. For example, a human recognition algorithm may be used to detect humans viewed by the imaging device 100, while further characteristics (e.g., color) may be used to distinguish the desired subject S from other human subjects S.

In a second way (object recognition), visual cues about objects may be used to select the subject(s) S₁-S_(n). Object attributes such as color may be used ("switch focus to the person in the red shirt"). Object attributes such as position ("switch focus to the object in the lower left-hand corner of the screen") and shape may also be used, and these various object attributes may also be used in combination ("switch focus to the person with long, straight, brown hair").
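
Both approaches amount to querying learned or observed characteristics; a minimal sketch (the registry contents, attribute names, and function name are illustrative assumptions):

    # Registry built during the teach-in phase, queried by voice-supplied
    # attributes; real systems would hold feature descriptors, not strings.
    subjects = {
        "S1": {"name": "Alex", "type": "human", "shirt_color": "red"},
        "S2": {"name": "Blair", "type": "human", "shirt_color": "blue"},
    }

    def select_target(**attributes):
        # Return IDs of subjects whose learned characteristics match all
        # requested attributes, e.g. select_target(shirt_color="red").
        return [sid for sid, traits in subjects.items()
                if all(traits.get(k) == v for k, v in attributes.items())]

    select_target(shirt_color="red")   # -> ["S1"], "person in the red shirt"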

In one example, the user may utilize the voice commands 750 within a planned or scripted shot or scene that may be planned out in advance using, e.g., software planning tools, so that cues may be given to move through the shot. An example shot might be one that statically frames two subjects S₁, S₂, then follows subject S₁ for ten seconds, then follows subject S₂ for five seconds, then pans out to frame both subjects S₁, S₂ with a horizon and other background elements of the scene. Such cinematic control could thus be integrated as part of the voice recognition system 70, and the composition of the shot may be controlled with commands such as: "places," "action," "next scene," (another) "next scene," "cut," "take it from the top," "take it from 'pan out.'" In this way, it is possible to create relatively sophisticated videos without requiring a high degree of physical interaction with the external device 50.

The types of control discussed above may be applied even when a controllable UAV is not used as part of the MIA 20. For example, when the imaging device 100 is connected to the imaging device movement mechanism 30, such as the gimbal mechanism discussed above, but there is no movable platform 40 or it is not one that is remotely controllable (e.g., a downhill skier uses the imaging device 100 with the imaging device movement mechanism 30 mounted to the skier's helmet or handheld by the skier), various types of the voice commands 750, such as subject selection and the like, may still be utilized.

Referring to FIG. 9B, a block diagram of a voice-operated tracking system 900 is shown. In a first module 910 (e.g., a voice command receiving module), a voice command is received. The voice command, such as the voice command 702, is received from an operator, for example, with the microphone 701 or another listening device. The voice command 702 may be received from an operator that is the target T or is associated therewith (e.g., if the target T is a vehicle in which or on which the operator is riding) and pertain to another subject S. As described above, the voice commands 750 may, for example, include one or more of basic flight instructions and tracking flight instructions. The tracking flight instructions may include one or more of flight maneuver instructions (e.g., orbit, back and forth, or track or follow), orientation instructions (e.g., left, south, etc.), position instructions (e.g., vertical and/or horizontal distance and/or speed), and/or target identifying instructions (e.g., to switch between subjects), as described above.

In a second module 920 (e.g., a voice command interpreting module), the voice command is interpreted, for example, with the voice recognition unit 703, to process the audio of the voice command and correlate the voice commands (e.g., the basic flight instructions or the tracking flight instructions, including flight maneuver, orientation, position, and/or target identifying instructions) to operational commands. For example, the interpreted voice command may be correlated to the command database 705, from which the operational commands are determined. The operational commands may include one or more of flight maneuver commands (e.g., orbit, back and forth, or track or follow), orientation commands (e.g., left, south, etc.), position commands (e.g., vertical and/or horizontal distance and/or speed), and/or target identifying commands (e.g., to switch between subjects).

In a third module 930 (e.g., a tracking execution module), the operational commands are executed by the MIA 20, the imaging device movement mechanism 30, and/or the imaging device 100 to move the MIA 20 relative to the target T and to move the imaging device movement mechanism 30 so as to execute the flight maneuver and maintain the target T in image frames captured by the imaging device 100. In executing the operational commands, the tracking system 900 may, when executing the flight maneuver with respect to a different subject S that has become the target T, identify the different subject S according to pre-identified characteristics (e.g., learned during a teach-in phase) or object recognition (e.g., another identifiable characteristic, such as color).

The various modules of the voice-operated tracking system 900 may be implemented by one or more of the tracking system 60, the MIA 20, the external device 50, the imaging device 100, and/or various hardware components thereof (e.g., processors, memories, and/or communications components).

Referring to FIG. 9C, a set of operations of the method 900 a is described for controlling the MIA 20 according to voice commands.

At 910 a, voice commands or instructions are received. For example, voice commands may be received by the external device 50 or another device with the microphone 701. The voice instructions may include one or more of basic flight instructions received at 912 a or tracking flight instructions received at 914 a. The tracking flight instructions may include one or more of flight maneuver instructions, position instructions, or target identifying instructions. The basic flight instructions may, for example, be for takeoff and landing of the MIA 20. The flight maneuver instructions may include instructions for the MIA 20 to fly in a partially or wholly predetermined flight pattern relative to the target T (e.g., track/follow, orbit, etc.). The position instructions may include instructions for the MIA 20 to fly at desired positions relative to the target T.

At 920 a, the voice instruction is interpreted, for example, with the voice recognition unit 703. For example, audio of the voice instructions may be processed and correlated to operational commands. At 922 a, the audio of the voice instruction is processed to interpret the voice instruction, for example, using a voice detection algorithm. The audio may be processed by the external device 50 or another device associated with the microphone 701, or may be sent to the external device 50 or the MIA 20 for processing thereby. At 924 a, the voice instruction is correlated to operational commands. The operational commands may, for example, be contained in the command database 705, which may be stored by the external device 50, another device associated with the microphone 701, or the MIA 20. The operational commands may include basic flight commands or tracking flight commands, the latter of which may include flight maneuver commands, position commands, and/or target identifying commands.

At 930 a, the operational commands are executed by the MIA 20. For example, the operational commands may be sent from the external device 50 or another device associated with the microphone 701 to the MIA 20, as the case may be. The operational commands may then be executed by the tracking system 60 or other suitable tracking or flight module described herein (e.g., 300, 700, 700′, etc.). For example, based upon a voice instruction to take off, the MIA 20 begins flight; to land, the MIA 20 lands or returns to a home position; for a flight maneuver instruction, the MIA 20 flies according to the predetermined flight pattern associated therewith; for a position instruction, the MIA 20 flies in the instructed position relative to the target T; and/or for a target identifying instruction, the MIA 20 may change the target T to another subject S and execute flight instructions relative thereto (e.g., flight maneuver and/or position).

Operations 910 a to 930 a are repeated upon receiving new voice instructions.

Ultra-Wide-Band Localization using a Beacon Schema

A GPS device may be mounted to the MIA 20 and to the target T. The absolute positions of each may be read from the GPS devices, and then a relative position between the two may be determined. However, the accuracy of GPS devices, particularly in measuring altitude, is generally limited and not sufficient to allow precise subject tracking control. It is desirable, when performing aerial subject tracking, to accurately know the position of the target T with respect to the MIA 20. The use of GPS beacons, i.e., devices that use GPS satellites to determine position and then broadcast that position to other GPS beacons, may be applied in the context of aerial subject tracking.

FIG. 11A is a pictorial representation of an implementation of the MIA 20 tracking a target T. In order to improve the accuracy in measuring a distance between the target T and the MIA 20, the system illustrated in FIG. 11A may utilize a set of ultra-wide-band transceivers (UWBTs) 800 a-c (collectively or representatively, 800), 802 to directly estimate a relative position and velocity of the target T with respect to the MIA 20. This may be done by distributing anchor UWBTs 800 along the body of the MIA 20 and affixing a target UWBT 802 to the moving target T, for example, by affixing three or more UWBTs 800 a-800 c with a known position (with respect to the MIA 20) on the MIA 20. Additionally, in this implementation, the target T has one additional target UWBT 802 affixed to it.

This implementation presents a low-cost approach to create a local relative position measurement system that can determine a distance between the MIA 20 and the movable target T having considerable accuracy. The accurately determined distance can then be provided to the tracking system 60 or other components of the MIA 20. Rather than using the UWBTs in static scenarios where a set of anchor UWBTs are distributed on the ground, the anchor UWBTs 800 are positioned such that all are movable with respect to a fixed-frame (e.g., earth-based) reference coordinate system, as is the target UWBT 802. Thus, this implementation performs subject tracking without requiring the use of static beacons. Static beacons may take time (and effort, in difficult environments) to place, set up, initialize, and/or configure, and the use of the MIA 20 may be restricted to locations close to where the static beacons are placed. A device that determines and analyzes positions calculated from the UWBTs 800, 802 can be located on the MIA 20 or the target T.

To perform subject tracking in this implementation, a distance between the target UWBT 802 on the target T and each of the anchor UWBTs 800 a-800 c anchored on the MIA 20 may be measured by a known time-of-arrival approach. For instance, at substantially the same time, one or more of the anchor UWBTs 800 a-c may transmit a respective signal 804 a, 804 b, or 804 c. Based on the times of travel for the signals 804 a-c and the velocity thereof, individual distances between the anchor UWBTs 800 a-c and the target UWBT 802 may be determined. From the individual distances, a relative position of the target T may be derived using, e.g., known sphere intersection techniques (where four or more UWBTs 800 a-c, 802 serve as anchors) or known triangulation techniques (where only three UWBTs 800 a-c serve as anchors).

By employing phase shift approaches, a relative direction of the target T with respect to the MIA 20 may be derived. This becomes more useful once a position estimate degrades due to conditioning issues (e.g., a small anchor baseline relative to the subject-UAV distance). By fixing an inertial measurement unit on one or more of the UWBTs 800 a-800 c of the MIA 20 and/or the target UWBT 802 on the target T, relative position estimates may be improved. In addition, relative velocity estimates may be improved, both in terms of relative positions (between the target T and the MIA 20) and absolute positions (with respect to an earth framework).

FIG. 11B illustrates an example system for tracking the movement of a target. The example system may include a set of three or more anchor UWBTs 800 a-c, a target UWBT 802, a position sensor system 810, an MIA position determination module 812, a target position determination module 814, the tracking system 60, an MIA control module 816, and the imaging device movement mechanism 30. The system may include additional components not discussed. Furthermore, two or more of the components disclosed above may be integrated into a single component.

The position sensor system 810 may include one or more sensors that output signals that indicate a position of the MIA 20. For example, the position sensor system 810 may include one or more metadata sources 144, such as one or more IMUs, accelerometers, gyroscopes, and/or a global positioning system (GPS). In implementations where the position sensor system 810 includes accelerometers and/or gyroscopes, the position sensor system 810 may output position signals that are relative to a starting location. For instance, if the flight of the MIA 20 begins at a starting location that is assigned the coordinates (0,0,0), the position sensor system 810 may output position signals that indicate the position relative to the starting location. In this example, the coordinates are a triple that indicates an east/west value (e.g., along an x-axis), a north/south value (e.g., along a y-axis), and an altitude (e.g., along a z-axis), where the units may be any suitable units of measure (e.g., meters, feet, yards). Continuing the example, if the MIA 20 flies upward 10 meters, the position sensor system 810 may output a position signal indicating a location of (0, 0, 10). The position sensor system 810 may output position signals at predetermined time intervals, e.g., every second. To the extent that the position sensor system 810 includes a GPS, the GPS signal may be an absolute position signal (e.g., latitude and longitude) and may be blended with the output of an accelerometer and/or gyroscope to more accurately estimate the position of the MIA 20.

The MIA position determination module 812 monitors the position sensor system 810 to estimate a position of the MIA 20. In some implementations, the MIA position determination module 812 receives position signals from the position sensor system 810 and determines the position of the MIA 20 throughout the flight of the MIA 20. The MIA position determination module 812 may determine a relative position of the MIA 20 or an estimated absolute position of the MIA 20 at a given time. The output of the MIA position determination module 812 may include a time stamp, such that the output indicates a position of the MIA 20 and a relative time when the position of the MIA 20 was determined (e.g., (t, x, y, z)). For example, the MIA position determination module 812 may output the following series of positions: (0, 0, 0, 0), (1, 0, 0, 8), (2, −3, 0, 8), (3, −6, 0, 8). In this example, the output of the MIA position determination module 812 indicates that the MIA 20 moved eight meters in an upward direction between time=0 sec and time=1 sec, and then proceeded to move in an easterly direction over the next two seconds. The output of the MIA position determination module 812 may indicate relative positions at a given time or estimated absolute positions of the MIA 20 at a given time. The MIA position determination module 812 may output the time-stamped positions of the MIA 20 to the target position determination module 814, the tracking system 60, and/or the MIA control module 816.

The target position determination module 814 determines a position of the target UWBT 802 with respect to the MIA 20 at a given time. The position may be an estimate of the actual position of the target UWBT 802 with respect to the MIA 20. In some implementations, the target position determination module 814 controls the anchor UWBTs 800 a-c to determine the position of the target UWBT 802. The target position determination module 814 may command the anchor UWBTs 800 a-c to transmit respective signals to the target UWBT 802. In response to the signals, the target UWBT 802 returns a corresponding response signal (or set of response signals) that is received by each of the anchor UWBTs 800 a-c. Each of the anchor UWBTs 800 a-c can pass the signal to the target position determination module 814. The target position determination module 814 can determine a roundtrip time for each instance of the response signal received via each respective anchor UWBT 800 from the target UWBT 802. Put another way, the target position determination module 814 can determine the total amount of time that lapses between a respective anchor UWBT 800 sending the signal to the target UWBT 802 and the respective anchor UWBT 800 receiving the response signal from the target UWBT 802. Based on the total amount of time to transmit and receive the signal to/from the target UWBT 802, the target position determination module 814 can estimate the distance between a respective anchor UWBT 800 and the target UWBT 802. Alternatively, the target position determination module 814 may instruct a single anchor UWBT 800 to transmit a signal to the target UWBT 802. In response to receiving the signal, the target UWBT 802 broadcasts a single signal that is received by each of the anchor UWBTs 800. In either scenario, the target position determination module 814 may utilize the amount of time that lapsed between the transmission of the signal(s) and receipt of the response signal(s) at the respective anchor UWBTs 800 to estimate the distance between each anchor UWBT 800 and the target UWBT 802. In these cases, the target position determination module 814 may account for the amount of time for the target UWBT 802 to transmit the response signal in response to receiving a signal from the anchor UWBT(s) 800. The target position determination module 814 may treat this time as a constant, c, such that the constant c is subtracted from the total roundtrip time. For each anchor UWBT 800, the target position determination module 814 can estimate the distance between the respective anchor UWBT 800 and the target UWBT 802 by, for example, the following formula:

d=(t−c)v   (1)

where t is the total round-trip time, c is the constant attributed to the target UWBT 802 responding, and v is the speed at which the signals travel.

Upon determining the distance between each anchor UWBT 800 and the target UWBT 802, the target position determination module 814 can calculate a position of the target UWBT 802 with respect to the MIA 20. Given that the anchor UWBTs 800 are fixed along a body of the MIA 20, and, therefore, the distances between the anchor UWBTs are fixed, the target position determination module 814 can utilize these fixed points to determine the position of the target with respect to the MIA 20 or a specific point on the MIA 20. The position with respect to the MIA 20 may be a three-dimensional vector that imparts distances in the x, y, and z directions with respect to the MIA 20. In some implementations, the target position determination module 814 determines the position of the target UWBT 802 with respect to the MIA 20 using triangulation techniques. In determining the position of the target UWBT 802 relative to the MIA 20, the target position determination module 814 may also take into account an orientation of the MIA 20, which may, for example, be determined by the MIA position determination module 812 using the position sensor system 810 (e.g., accelerometers, etc.). For example, if the MIA 20 were to roll, pitch, or yaw about the x, y, and z axes in real space between times at which the relative position of the target UWBT 802 is determined, the anchor UWBTs 800 a-c would also move relative to the x, y, and z axes, which may be accounted for when relating the relative motion or position of the target UWBT 802 back to the x, y, z coordinate system. As an illustration, if the MIA 20 were to yaw 180 degrees about the z-axis between two times without relative movement of the UWBT 802, the measured x and y distances would have opposite signs between the two times (e.g., the UWBT 802 might change from being behind to in front of the MIA 20).
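
An illustrative sketch of equation (1) followed by the position solve (the anchor coordinates, timing constant, and function names are assumptions; a classic three-sphere trilateration stands in for whichever intersection or triangulation technique is actually employed):

    import math

    def rtt_to_distance(t, c=50e-9, v=299_792_458.0):
        # Equation (1): d = (t - c) * v, with c the (assumed) response lag
        # of the target UWBT 802 and v the signal propagation speed.
        return (t - c) * v

    def trilaterate(p1, p2, p3, r1, r2, r3):
        # Intersect three range spheres centered on the anchor positions
        # p1, p2, p3 (known in the MIA frame) with measured radii r1, r2, r3.
        sub = lambda a, b: [a[k] - b[k] for k in range(3)]
        dot = lambda a, b: sum(a[k] * b[k] for k in range(3))
        ex = sub(p2, p1)
        d = math.sqrt(dot(ex, ex))
        ex = [e / d for e in ex]
        t13 = sub(p3, p1)
        i = dot(ex, t13)
        ey = [t13[k] - i * ex[k] for k in range(3)]
        j = math.sqrt(dot(ey, ey))
        ey = [e / j for e in ey]
        ez = [ex[1] * ey[2] - ex[2] * ey[1],
              ex[2] * ey[0] - ex[0] * ey[2],
              ex[0] * ey[1] - ex[1] * ey[0]]
        x = (r1**2 - r2**2 + d**2) / (2 * d)
        y = (r1**2 - r3**2 + i**2 + j**2 - 2 * i * x) / (2 * j)
        z = math.sqrt(max(r1**2 - x**2 - y**2, 0.0))  # mirror solution: -z
        return [p1[k] + x * ex[k] + y * ey[k] + z * ez[k] for k in range(3)]

    # Anchors 0.4 m apart on the MIA body (illustrative coordinates).
    pos = trilaterate((0, 0, 0), (0.4, 0, 0), (0, 0.4, 0), 5.39, 5.14, 5.21)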

Using the target position with respect to the MIA 20, as well as the position and orientation of the MIA 20 that is obtained from the MIA position determination module 812, the target position determination module 814 can determine an estimated location of the target UWBT 802. The estimated position of the target UWBT 802 may be relative to the starting point of the MIA 20 (e.g., in the case where only gyroscopes and accelerometers are used) or an absolute location (e.g., if the position sensor system 810 includes a GPS). The target position determination module 814 may determine the location of the target UWBT 802 according to:

LOC_(Target)=POS_(MIA)+POS_(Target)   (2)

where LOC_(Target) is either the relative or absolute location of the target UWBT 802, POS_(MIA) is the relative or absolute location of the MIA 20, and POS_(Target) is the position of the target UWBT 802 determined with respect to the MIA 20 using triangulation or another suitable method.
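
A sketch of equation (2) with the MIA's yaw taken into account (a yaw-only rotation is a simplifying assumption; a full implementation would use the complete roll/pitch/yaw attitude):

    import math

    def locate_target(mia_xyz, yaw_rad, target_in_body):
        # Rotate the body-frame target position by the MIA's yaw, then apply
        # equation (2): LOC_Target = POS_MIA + POS_Target (world frame).
        bx, by, bz = target_in_body
        wx = bx * math.cos(yaw_rad) - by * math.sin(yaw_rad)
        wy = bx * math.sin(yaw_rad) + by * math.cos(yaw_rad)
        return (mia_xyz[0] + wx, mia_xyz[1] + wy, mia_xyz[2] + bz)

    # A 180-degree yaw flips the signs of the body-frame x and y offsets,
    # matching the behind-to-in-front illustration above.
    locate_target((0.0, 0.0, 8.0), math.pi, (2.0, 1.0, -0.5))  # ~(-2, -1, 7.5)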

In some implementations, the target position determination module 814 can determine a direction of travel and/or velocity (which includes the direction of travel) of the MIA 20 and of the target UWBT 802 with respect to the MIA 20. In these implementations, the MIA position determination module 812 can monitor the position of the MIA 20 at a first time and a second time. Based on the position of the MIA 20 at the first time and at the second time, the MIA position determination module 812 can determine the velocity of the MIA 20, where the velocity is a three-dimensional vector showing a magnitude of velocity with respect to the x, y, and z axes. The amount of time between the first time and the second time may be any suitable amount of time. Preferably, no more than two seconds should pass between the first time and the second time, so that the velocity of the MIA 20 may be more accurately estimated.

Similarly, the target position determination module 814 can monitor the position of the target T (i.e., the UWBT 802) relative to the MIA 20 at a first time and a second time. Based on the position of the UWBT 802 at the first time and at the second time, the target position determination module 814 can determine the velocity of the UWBT 802 relative to the MIA 20. The velocity of the UWBT 802 relative to the MIA 20 may be added to the velocity of the MIA 20 to determine the velocity of the UWBT 802 with respect to the x, y, and z axes. Instead, or additionally, the velocity of the UWBT 802 relative to the MIA 20 may be determined according to principles of the Doppler effect by comparing a measured frequency of one or more of the signals 804 a, 804 b, 804 c to a default frequency thereof (e.g., the frequency that would be measured if no relative movement between the UWBT 802 and the MIA 20 were to occur).
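
A minimal sketch of these velocity estimates (the sample positions and time step are illustrative; they reuse the coordinate convention of the examples above):

    def velocity(p_then, p_now, dt):
        # Finite-difference velocity between two time-stamped positions.
        return tuple((n - t) / dt for t, n in zip(p_then, p_now))

    v_mia = velocity((0, 0, 8), (-3, 0, 8), dt=1.0)        # MIA velocity
    v_rel = velocity((4.0, 0, -2), (4.5, 0, -2), dt=1.0)   # target relative to MIA
    v_target = tuple(m + r for m, r in zip(v_mia, v_rel))  # target, world frame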

The target position determination module 814 can output the position of the target UWBT 802 with respect to the MIA 20, the location of the target UWBT 802, and/or the velocity of the target UWBT 802 to the tracking system 60, the MIA control module 816, and/or the imaging device movement mechanism 30. In some implementations, the tracking system 60 and/or the imaging device movement mechanism 30 may utilize the position of the target UWBT 802 to adjust the orientation of the imaging device 100, such that the target may be observed in the field of view of the imaging device 100. Similarly, the tracking system 60 and/or the imaging device movement mechanism 30 may utilize the position of the target UWBT 802 to adjust a zoom setting of the imaging device 100. For example, if the location of the target UWBT 802 is relatively far away (e.g., more than 20 meters), the tracking system 60 and/or the imaging device movement mechanism 30 may increase the zoom setting to better observe the target T.

In some implementations, the MIA control module 816 controls the movement of the MIA 20. The MIA control module 816 may be programmed with one or more routines that define a motion of the MIA 20 with respect to a target (e.g., a fixed distance from the target, circling the target, etc.), with respect to a starting location, or with respect to any other suitable reference. To perform the routines, the MIA control module 816 may utilize the location of the target UWBT 802 and/or the velocity of the target UWBT 802, and the routine being executed, to control the flight path of the MIA 20. For example, if the MIA 20 is to keep a fixed distance from the target, then the MIA control module 816 may mirror the velocity of the target UWBT 802 to maintain the fixed distance from the target. The foregoing scenarios are provided for example only, as the tracking system 60, the MIA control module 816, and/or the imaging device movement mechanism 30 may utilize the position of the target UWBT 802 with respect to the MIA 20, the location of the target UWBT 802, and the velocity of the target UWBT 802 to control any number of settings of the imaging device 100 or movements of the MIA 20.

While shown as having three anchor UWBTs 800 a-c, the MIA 20 may include any number of anchor UWBTs 800 greater than or equal to three. Furthermore, it is understood that the estimated position of the target UWBT 802 with respect to the MIA 20, the estimated location of the target UWBT 802, and the estimated velocity of the target UWBT 802, as determined by the target position determination module 814, may be blended with outputs of other components of the MIA 20 to obtain an estimated location and/or velocity of the target T.

FIG. 11C illustrates an example set of operations of a method 820 for determining a location of a target UWBT 802. The method 820 is described as being performed by the components of FIG. 11B. It is appreciated that the method 820 may be performed by any suitable components of an MIA 20 (or similar device) without departing from the scope of the disclosure.

At 822, the MIA position determination module 812 determines a position of the MIA 20. The MIA position determination module 812 may receive position signals from the position sensor system 810. For example, the MIA position determination module 812 may obtain signals from an accelerometer, a gyroscope, and/or a GPS system of the MIA 20 to obtain a position of the MIA 20 (e.g., x, y, and z coordinates). The position may be a relative position that is relative to a starting point of the MIA 20 or an absolute position (e.g., longitude, latitude, and altitude). The orientation of the MIA 20 (e.g., roll, pitch, and yaw about the x, y, and z axes) may also be determined, for example, relative to the starting point and/or fixed coordinates.

At 824, one or more of the anchor UWBTs 800 transmits a signal to the target UWBT 802. In response to receiving a signal from the anchor UWBTs 800, the target UWBT 802 returns a response signal to the anchor UWBTs 800. At 826, each anchor UWBT 800 receives the response signal from the target UWBT 802. It should be appreciated that each anchor UWBT 800 may receive the response signal at a slightly different time. At 828, the target position determination module 814 may determine the roundtrip time for the received response signal for each of the anchor UWBTs 800. The roundtrip time for a respective anchor UWBT 800 may be the amount of time from when the initial signal was transmitted to the target UWBT 802 to when the respective anchor UWBT 800 received the response signal. At 830, the target position determination module 814 may determine, for each respective anchor UWBT 800, a distance between the respective anchor UWBT 800 and the target UWBT 802 based on the roundtrip time of the signal. As mentioned, the target position determination module 814 may account for any lag attributed to the target UWBT 802 receiving the signal and transmitting the response signal. In some implementations, the target position determination module 814 may utilize equation (1), as provided above, to determine the distance between the respective anchor UWBT 800 and the target UWBT 802.

At 832, the target position determination module 814 determines a position of the target UWBT 802 with respect to the MIA 20 based on the determined distances. The position of the target UWBT 802 relative to the MIA 20 may also be based on the orientation of the MIA 20. Using triangulation techniques, the target position determination module 814 can estimate a position of the target UWBT 802 with respect to the MIA 20. As previously discussed, the target position determination module 814 knows the positions of each of the anchor UWBTs 800 with respect to the MIA 20. Using the positions of the anchor UWBTs 800 with respect to one another and the distances determined at 830, the target position determination module 814 determines the position of the target UWBT 802 with respect to the MIA 20, for example, using triangulation. At 834, the target position determination module 814 can optionally determine a location of the target UWBT 802. The target position determination module 814 can utilize the position of the MIA 20 and the position of the target UWBT 802 relative to the MIA 20 to determine the location of the target UWBT 802. In some implementations, the target position determination module 814 utilizes equation (2) to determine the location of the target UWBT 802. The location of the target UWBT 802 may be a relative location (e.g., with respect to a starting point of the MIA 20) or an absolute location (e.g., latitude, longitude, and height). At 836, the target position determination module 814 can output the determined location of the target UWBT 802 and/or the position of the target UWBT 802 with respect to the MIA 20. For example, the target position determination module 814 may output one or both of the values to the tracking system 60, the MIA control module 816, and/or the imaging device movement mechanism 30 to control the operation of the MIA 20.

It should be appreciated that the outputted values may be regarded as estimates of location, position, and/or velocity. The foregoing operations may be combined into a single method, whereby estimates of the location, position, and/or velocity of the target T may be output to the downstream components of the MIA 20.

Improvements in relative position and velocity estimates may be advantageous since the high-level output of the system may be noisy position measurements and/or a relative range between beacons. By fusing this output with gyroscope and accelerometer data in a sensor fusion framework, the system may be able to: a) increase frequency (inertial measurement unit (IMU) data may be higher frequency than UWB measurements); b) reduce noise in position estimates; c) obtain accurate velocity information (by fusion of position and acceleration (which is a second derivative of position)); and d) reduce a delay in a position estimate by synchronizing a time between IMU measurements (very low latency) and UWBT measurements such that any delay in providing the UWBT measurements may be eliminated.
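
As a hedged sketch of the fusion idea (a simple complementary filter stands in here for the full sensor fusion framework; the gain, rates, and function name are illustrative assumptions):

    def fuse_step(pos, vel, accel, dt, uwb_pos=None, gain=0.2):
        # Propagate with high-rate IMU acceleration (dead reckoning) ...
        vel = tuple(v + a * dt for v, a in zip(vel, accel))
        pos = tuple(p + v * dt for p, v in zip(pos, vel))
        # ... and pull the drifting estimate toward each low-rate UWBT fix.
        if uwb_pos is not None:
            pos = tuple(p + gain * (u - p) for p, u in zip(pos, uwb_pos))
        return pos, vel

    pos, vel = (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)
    pos, vel = fuse_step(pos, vel, (0.1, 0.0, 0.0), dt=0.005)  # IMU-only step
    pos, vel = fuse_step(pos, vel, (0.1, 0.0, 0.0), dt=0.005,
                         uwb_pos=(0.01, 0.0, 0.0))             # UWBT correction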

System Architecture and Dataflow: Latency and Synchronization

As described with respect to FIG. 1, the MIA 20 may include the imaging device 100, such as a camera, which may be mounted to the movable platform 40, such as a drone, via an imaging device movement mechanism 30, such as a gimbal as described above. The imaging device movement mechanism 30 can also provide for active stabilization of the imaging device 100, and/or the captured images themselves can be stabilized using image shake correction techniques. The external device 50, such as the MIA controller and user interface discussed above, may be utilized for controlling the MIA 20.

Referring to FIGS. 12A-12D, a tracking display system 1200 is configured to simultaneously display on the display screen of the external device 50 both video captured by the MIA 20 (e.g., the imaging device 100) and tracking information corresponding to the video being displayed (i.e., tracking information corresponding to the video image frames). The tracking display system 1200 includes various modules that are included in and/or performed by various hardware components of the movable imaging system 10 (e.g., the MIA 20, the imaging device 100, the external device 50, the tracking system 60, etc.).

As shown in FIG. 12D, the display device 52 of the external device 50 displays a tracking graphic S_(track) that visually identifies the subject S being tracked, with the tracking graphic S_(track) being displayed on the external device 50 in a generally constant spatial relationship relative to the subject S in successive video image frames F_(t), F_(t+1), F_(t+2), even as the subject S moves to different positions and/or changes sizes within successive video image frames. The tracking graphic S_(track) may, for example, be a bounding box (e.g., an outline of the bounding box) that is displayed around the subject S or a portion thereof in successive video image frames on the external device 50.

FIG. 12A is a block diagram of the tracking display system 1200, which includes an imaging and tracking system 1210 and a display system 1220 and may implement a tracking display method 1200 a. The imaging and tracking system 1210 is operated by the MIA 20 (e.g., by the imaging device 100 and the tracking system 60, which itself may be operated by the imaging device 100). The imaging and tracking system 1210 may also be referred to as a combined imaging and tracking system 100, 60. The display system 1220 may be included with and/or be operated by the external device 50. A video stream (e.g., a display video stream) and a metadata stream (e.g., a subject stream) may be sent simultaneously from the imaging and tracking system 1210 (e.g., be provided as outputs from the MIA 20 via the link 55) to the display system 1220 (e.g., the external device 50).

The imaging and tracking system 1210 includes various modules for capturing video image frames, storing the video image frames, generating tracking information from the video image frames, and sending the video image frames and the tracking information to the display system 1220 (e.g., from the MIA 20 via the link 55 to the external device 50). In a raw video module 1212 (e.g., an image capture module), raw video is captured by the imaging and tracking system 1210 (e.g., by the image sensor 136 of the imaging device 100). In a video pipe module 1214, the raw video (or processed video) is sent or distributed to a storage module 1216, a tracking module 1218, and the display system 1220.

The storage module 1216 receives a main video stream (e.g., high resolution video), which is stored as a video track along with a corresponding metadata track (e.g., having time information and/or other metadata described previously) (e.g., by the electronic storage 138 of the imaging device 100). A main video stream module 1216 a may process the raw video from the video pipe module 1214 (e.g., converting to an appropriate format and/or resolution) before it is stored by the storage module 1216.

The tracking module 1218 tracks a subject S (e.g., performs subject following), for example, by operating the tracking system 60. Algorithms of the tracking system 60 (e.g., software programming containing the algorithms) may be run on the MIA 20, for example, by the imaging device 100. The tracking module 1218 receives a secondary video stream (e.g., low resolution video) and generates tracking information therefrom. A secondary stream module 1218 a may process the raw video from the video pipe module 1214 (e.g., converting to an appropriate format and/or resolution) before it is processed by the tracking module 1218. The tracking information generated by the tracking module 1218 is sent as metadata to the display system 1220. A metadata module 1219 (e.g., operated by a MUX or multiplexer) may, for example, process the tracking information (e.g., converting to an appropriate metadata format, such as pertaining to the tracking graphic S_(track)) to generate the metadata sent to the display system 1220. This tracking information (i.e., the output from the tracking algorithm running on the imaging device 100) may also be used by the MIA 20 for the physical (e.g., actual) tracking of the subject S (or the target T) with the MIA 20 and/or the imaging device 100.

The video pipe module 1214 may send the display video stream directly to the display system 1220. Alternatively, an intermediate module (not shown) may process the raw video (e.g., into an appropriate format and/or resolution) to be sent to and received by the display system 1220.

Still referring to FIG. 12A, a block diagram of the display system 1220 includes various modules for processing the display video stream and the metadata stream, and for displaying the video image frames and the tracking information. The display system 1220 may be considered an un-optimized display system, as the video stream and the tracking information (e.g., the overlay of the tracking graphic S_(track)) may be displayed asynchronously, as discussed below.

A tracking overlay module 1222 (e.g., video module) of the display system 1220 is provided with the metadata stream as an input, and processes the metadata stream to perform an overlay function for the display device 52. The metadata stream may include tracking information, such as a location and size (or shape) of the subject S, based on which the tracking overlay module 1222 generates the tracking graphic S_(track) for display (e.g., overlay on the video image frames on the display device 52 of the external device 50). For example, the tracking graphic S_(track) may be a bounding box that moves position and is sized to be around the subject S as the subject S moves in successive video image frames.

The display video stream is provided as an input to a video decoding module 1224 (e.g., a video decoder) of the display system 1220. The video decoding module 1224 processes the display video stream for displaying video image frames (e.g., converts to an appropriate format and/or resolution for display on the display device 52 of the external device 50).

A display module 1226 of the display system 1220 then displays both the video image frames and the tracking graphic S_(track). For example, the display module 1226 may be or include the display device 52 of the external device 50.

The tracking graphic S_(track) (e.g., the tracking overlay) may be displayed asynchronously with the video image frames (e.g., the decoded video) on the display device 52. Delay between the display image stream and the metadata stream from the imaging and tracking system 1210 to the display system 1220 may occur in various circumstances. For example, when the imaging and tracking system 1210 is operated entirely by the imaging device 100, the processor 132 of the imaging device 100 may execute those modules related to video (e.g., capture, processing, storage, and transfer in modules 1212, 1214, 1216, 1216 a), while also executing those modules related to tracking (e.g., generation, processing, and transfer in modules 1218, 1218 a, and 1219). The processor 132 may, in some circumstances, lack sufficient processing power, for example, due to size and power constraints, to perform both the video-related functions, which may be the primary task of the processor 132, and the tracking-related functions or tasks. Thus, running the tracking system 60 on the imaging device 100 may be relatively slow and introduce delay between the display video stream and the metadata stream. In turn, the display system 1220 may display the video image frames and the tracking graphic S_(track) out of time sync with each other. Alternative display systems 1220′ and 1220″ are discussed below, which may reduce and/or eliminate this delay between the successive video frames and the tracking information to the user (e.g., with the bounding box around the subject S or other metadata on the display device 52 of the external device 50).

A framerate of the display video stream (e.g., from the imaging device 100) may be higher than a framerate of the metadata stream (e.g., from the tracking system 60), for example, due to the high processing load of the processor 132 (e.g., performing both the video and tracking related functions, as discussed above) or the high processing load of another processor of the MIA 20 otherwise performing the tracking functions. As a result, display of the tracking graphic S_(track) (e.g., the bounding box around the subject S) or other displayed metadata associated with the video image frames can be disjointed and have a stuttering look and feel to the user. For example, if the video image frames were displayed at 60 frames per second (fps) and the tracking graphic S_(track) were displayed at 12 fps, the tracking graphic S_(track) would move only once for every five video image frames and, thus, stutter (e.g., lag, jump, etc.) as the subject S moves to different positions within successive video image frames.

In order to smooth the appearance of motion of the tracking graphic S_(track) (e.g., the bounding box), motion interpolation techniques may be used to move the tracking graphic S_(track) for video image frames for which there is no metadata (i.e., due to the different frame rates). The motion interpolation techniques that are applied by the tracking system 60 to the subject S (or the target T) for determining the ROI in successive video image frames, discussed above, can also be applied to motion of the tracking graphic S_(track) (e.g., the bounding box). Similar to determining the ROI of future video image frames, this motion interpolation of the tracking graphic S_(track) may be done by using a motion model based on one or more previous locations of the bounding box. The motion model can be based on fitting a curve (polynomial, spline), a recursive filter (such as an EKF), or some other method, as described above. For example, the display system 1220 may include a motion modeling module 1222 a that implements the motion interpolation technique. More particularly, the motion modeling module 1222 a determines, based on preceding location information (i.e., known location information corresponding in time to one or more previous video image frames), modeled location information for video image frames having insufficient (e.g., no) known location information corresponding thereto. Based on the modeled location information, the tracking overlay module 1222, the motion modeling module 1222 a, or another module generates the tracking graphic S_(track) that the display module 1226 then displays for those video image frames for which the modeled location information is determined and/or for which insufficient (or no) known location information is available. The motion modeling module 1222 a may also not be included, or functions of the motion modeling module 1222 a may be performed, for example, by the detect and identify module 1218 and/or the metadata module 1219.
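
A minimal sketch of such a motion model (a linear model over the two most recent known boxes; the 60 fps / 12 Hz rates follow the example above, and the names are assumptions):

    def extrapolate_box(prev_box, curr_box, frames_ahead, frames_per_update=5):
        # Linear motion model over the two most recent known boxes
        # (x, y, width, height); predicts boxes for frames with no metadata.
        step = frames_ahead / frames_per_update
        return tuple(c + step * (c - p) for p, c in zip(prev_box, curr_box))

    last_two = ((100, 80, 40, 60), (120, 90, 40, 60))   # 12 Hz metadata updates
    in_between = [extrapolate_box(*last_two, k) for k in range(1, 5)]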

Referring to FIGS. 12B and 12C, block diagrams are shown of alternative display systems 1220′ and 1220″ that may be used with the imaging and tracking system 1210 of the tracking display system 1200.

Referring to FIG. 12B, a block diagram of various modules of an optimized display system 1220′ with a low-latency redundant detect and/or identify module 1228′ is shown according to an implementation. Rather than receive separate video and metadata streams from the imaging and tracking system 1210, the display system 1220′ instead receives and processes the video stream (e.g., from the video pipe module 1214) to generate the metadata locally (e.g., the tracking graphic S_(track)). That is, in order to reduce latency for the display device 52 to the user on the display system 1220′ of the external device 50, the video stream may be fed to an input of a redundant detect and/or identify module 1228′ on the external device 50 (e.g., being redundant to the detect and identify module 1218 of the imaging and tracking system 1210 of the MIA 20). The display system 1220′ includes the tracking overlay module 1222 and the video decoding module 1224, as described previously, but these instead receive the metadata stream and the video stream locally from the redundant detect and/or identify module 1228′. The tracking efficiency and robustness may additionally be improved by using the motion estimates of the MIA 20 and the estimated position and velocity of the target T, as discussed above. In this implementation, the metadata stream may not be needed by the display system 1220′ since the metadata is determined by the redundant detect and/or identify module 1228′. The benefit is that there may be lower latency due to the display system 1220′ having more processing power and potentially dedicated image processing hardware that can execute the algorithm more quickly without requiring propagation of the metadata.

FIG. 12C is a block diagram of various modules of an optimized display system 1220″ using synchronization techniques according to an implementation. As described previously with respect to the imaging and tracking system 1210, the detection and identification functions may add additional latency to the system, and the tracking overlay or other detection/identification metadata output may trail the video frames constituting the video stream.

In the optimized display system 1220″, a frame ID for each image frame is associated with the image frame and with the metadata associated with that image frame, and is sent by the imaging and tracking system 1210 with the video stream and with the metadata stream. The system may be pipelined by forwarding the video before the detection algorithm is run. In this optimized display system 1220″, both the video stream and the metadata stream arrive at a video and metadata sync module 1228″ of the display system 1220″, which is able to use the frame ID, which is associated with both the image frames of the video stream and the metadata of the metadata stream, to synchronize the video metadata with the video frames. The video and metadata sync module 1228″ may, for example, employ an elastic buffer that allows the image frames and the metadata to be synced but displayed with a slight lag behind when first received by the display system 1220″. The result is that the display device 52 can present the video with its tracking overlay in a synchronized manner while minimizing latency. Use of pipelining can minimize the overall latency, and the synchronizing matches the video to the overlay.
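
A minimal sketch of this frame-ID pairing with an elastic buffer (the function names and buffering policy are assumptions; a real implementation would also bound and age out the buffers):

    frames, metadata = {}, {}   # elastic buffers keyed by frame ID

    def on_frame(frame_id, image):
        frames[frame_id] = image
        return try_display(frame_id)

    def on_metadata(frame_id, overlay):
        metadata[frame_id] = overlay
        return try_display(frame_id)

    def try_display(frame_id):
        # Render only when both halves with the same frame ID have arrived;
        # otherwise buffer, accepting a slight display lag.
        if frame_id in frames and frame_id in metadata:
            return frames.pop(frame_id), metadata.pop(frame_id)
        return None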

The various systems 1210, 1220 and the modules thereof may be implemented by one or more of the tracking system 60, the MIA 20, the external device 50, the imaging device 100, and/or various hardware components thereof (e.g., processors, memories, and/or communications components).

Referring to FIG. 12E, a set of operations of the method 1200 a is described for simultaneously displaying a video image stream and metadata.

At 1210 a, successive images forming a video stream are captured, for example, by the imaging device 100.

At 1220 a, metadata is determined from the successive images. For example, the successive images may be processed to determine the metadata as a location and/or a size of a target T therein and/or a bounding box within which the target T is positioned. The metadata may be determined from the successive images remote from a display device, for example, by the MIA 20 (e.g., by the imaging device 100), and then sent as a metadata stream via the link 55 to the external device 50. Alternatively, the metadata may be determined from the successive image frames locally to the display device, for example, by the external device 50 from a video stream sent to the external device 50 via the link 55.

At 1222 a, a common time stamp is associated with each of the successive images and the metadata associated therewith.

At 1224 a, the metadata may be interpolated according to the successive images. For example, the metadata may be generated at a lesser frequency than a frame rate at which the successive images are captured (e.g., metadata may be generated at 12 Hz, while the frame rate is 60 fps), such that metadata is associated with every fifth image. For image frames between times at which the metadata is generated, the metadata may be interpolated, for example, according to a motion model (e.g., linear fitting, curve fitting (e.g., polynomial, spline), or a recursive filter).

At 1230 a, the successive images and a graphic associated with the metadata are simultaneously displayed by a display device, such as the display device 52 of the external device 50. The graphic associated with the metadata may, for example, be a bounding box that is displayed generally around the target T (e.g., being overlaid with the successive images).

At 1232 a, the successive images and the metadata having the same time stamp are displayed simultaneously. For example, a time buffer may be applied by which the successive images and the metadata associated therewith, which may be received by the display device 52 at different times, may be synced to each other according to the time stamp. Alternatively, the metadata may be generated locally to the display device 52, which may result in lower latency, without the metadata stream being transferred in parallel with the video stream via the link 55.

Operations 1210 a to 1230 a (or 1232 a) are then repeated for still further successive image frames.

FIGS. 13-21 are block diagrams that illustrate several architectures that may be utilized to implement detect, identify, and draw functionalities as implemented by different components of the movable imaging system 10.

FIG. 13 is a block diagram that illustrates an architecture according to an implementation showing the imaging device 100, the movable platform 40, and the external device 50 along with the functionalities of detect, identify, track, draw, select, and synchronize.

FIG. 14 is a block diagram illustrating the detect and identify functions, which may constitute an image processing block with a frame input and a frame-relative subject stream that contains data related to one or more subjects within the video image frame.

FIG. 15 is a block diagram illustrating the track function, which may use the current attitude and position of the movable platform 40, a subject stream, and a desired subject stream to compute a desired trajectory setpoint.

FIG. 16 is a block diagram illustrating the synchronize function (e.g., “sync”), which may be used to implement an elastic buffer to partially or fully align the subject and video streams. This module may not be necessary if a synchronous architecture is chosen, or if an asynchronous architecture is chosen with acceptable video and subject layer misalignment.

FIG. 17 is a block diagram illustrating the select function, which may utilize a user input (a screen position) and the subject stream to compute a desired subject. The purpose of this module is to permit a “touch to select” of a subject on the screen.
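A hit test for such a touch-to-select module might look as follows (an illustrative sketch; the tuple representation of the subject stream and the shared screen/frame coordinate assumption are not from the disclosure):

    def select_subject(subjects, touch_x_px, touch_y_px):
        """Map a screen touch to a desired subject ("touch to select").

        `subjects` is the current frame's subject stream, taken here as
        (subject_id, (x, y, w, h)) tuples in pixel coordinates. Returns
        the ID of the hit subject whose box center is nearest the touch,
        or None when the touch lands outside every bounding box.
        """
        best_id, best_d2 = None, None
        for subject_id, (x, y, w, h) in subjects:
            if x <= touch_x_px <= x + w and y <= touch_y_px <= y + h:
                cx, cy = x + w / 2.0, y + h / 2.0
                d2 = (cx - touch_x_px) ** 2 + (cy - touch_y_px) ** 2
                if best_d2 is None or d2 < best_d2:
                    best_id, best_d2 = subject_id, d2
        return best_id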

FIG. 18 is a block diagram illustrating a draw function that may use the desired subject stream, or all subjects and the subject stream, to compute an on-screen display (OSD) overlay layer to be drawn on top of the video stream. The purpose of this module is to visualize the detected and identified subjects on the UI of the external device 50.
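As a rough illustration of the draw function (assuming OpenCV for rasterization and reusing the tuple representation from the select sketch above; neither assumption comes from the disclosure):

    import cv2  # assumed available for rasterizing the OSD layer

    def draw_osd(frame, subjects, desired_id=None):
        """Draw subject boxes on top of a video frame, highlighting the
        desired subject if one has been selected."""
        for subject_id, (x, y, w, h) in subjects:
            color = (0, 255, 0) if subject_id == desired_id else (255, 255, 255)
            cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
            cv2.putText(frame, str(subject_id), (x, y - 4),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)
        return frame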

Various alternate solutions can also be provided. For example, FIG. 19 is a block diagram illustrating a distributed solution. In this solution, the detect and identify functions are implemented on the imaging device 100, the track function is implemented on the movable platform 40, and the draw, select, and synchronize functions are implemented on the external device 50.

FIG. 20 is a block diagram illustrating a camera-centric solution in which the sync and draw functions are implemented on the imaging device 100 instead of the external device 50.

FIG. 21 is a block diagram illustrating a controller-centric solution in which all functionality other than tracking is implemented in the external device 50. In this design, the sync function is not required since this solution is completely synchronous.

In a distributed solution design, the detect and identify modules are implemented and optimized for the imaging device 100. Support may be added to handle a subject stream. The subject stream may contain, e.g., a subject ID, a subject location in the image in, e.g., pixel coordinates, a bounding box around a subject in pixels, and a distance to a subject (as an absolute distance or up to scale). The video pipeline may be optimized for low latency, and the low resolution video (LRV) stream may be optimized as input for the detect and identify modules. A metadata muxer may be configured to handle a subject stream and to write the subject stream to, e.g., a session mp4 text track. The muxer may also be configured to write the subject stream out to a USB/MTP interface.
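The subject stream entry enumerated above might be modeled as follows (a hypothetical layout; the disclosure names the contents but not an encoding for the session mp4 text track):

    from dataclasses import dataclass
    from typing import Optional, Tuple

    @dataclass
    class SubjectRecord:
        subject_id: int
        location_px: Tuple[int, int]        # subject location in pixel coords
        bbox_px: Tuple[int, int, int, int]  # bounding box (x, y, w, h) in pixels
        distance_m: Optional[float] = None  # absolute distance, else up to scale

    def to_text_track_line(rec: SubjectRecord) -> str:
        """Serialize one record for the metadata muxer's text track."""
        x, y, w, h = rec.bbox_px
        return (f"id={rec.subject_id} "
                f"loc={rec.location_px[0]},{rec.location_px[1]} "
                f"bbox={x},{y},{w},{h} dist={rec.distance_m}")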

In the movable platform 40 of the distributed solution design, a USB/MTP interface may be configured to handle a subject stream. A universal asynchronous receiver/transmitter (UART) or other interface may be configured to push the subject stream and desired subject to a flight controller subsystem. A drone command and control (C&C) interface may be configured to handle the desired subject stream. It is possible to implement the sync module before the muxer block, but this design instead implements the sync, if needed, either further upstream or downstream to minimize the total latency in the system.
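For the UART push, a fixed-size wire format is one plausible choice (an assumption for illustration only; the disclosure does not specify a framing):

    import struct

    # Hypothetical little-endian layout for one subject-stream entry plus
    # the desired subject ID: id, center x/y, bbox x/y/w/h, distance, desired.
    SUBJECT_WIRE = struct.Struct("<IhhhhhhfI")

    def pack_subject(subject_id, cx, cy, bbox, distance_m, desired_id):
        x, y, w, h = bbox
        return SUBJECT_WIRE.pack(subject_id, cx, cy, x, y, w, h,
                                 distance_m, desired_id)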

In a flight controller of the movable platform 40 for the distributed solution design, the tracking system may be implemented using the subject stream and the desired subject to compute the desired trajectory. A desired MIA 20 trajectory setpoint may be parameterized by, e.g., position, velocity, acceleration, or attitude of the MIA 20. The UART or other interface may be configured to handle the subject stream and the desired subject. A state machine may be configured to implement a tracking state.
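A minimal container for such a setpoint, under assumed frames and units (e.g., NED position in meters, attitude as yaw/pitch/roll in radians; these choices are illustrative, not disclosed):

    from dataclasses import dataclass
    from typing import Tuple

    Vec3 = Tuple[float, float, float]

    @dataclass
    class TrajectorySetpoint:
        position_m: Vec3         # desired MIA position
        velocity_mps: Vec3       # desired velocity
        acceleration_mps2: Vec3  # desired acceleration (feed-forward)
        attitude_rad: Vec3       # yaw, pitch, roll

A flight controller's tracking state would emit such setpoints at a fixed rate as the subject stream and the desired subject are updated.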

The external device 50 in the distributed solution design may be utilized to implement the select and draw functions as well as any further required UI functions. Optionally, the sync function may be implemented by the external device 50 in order to align the video stream with the subject stream. The native system may be configured to receive the subject stream over the interface of the movable platform 40 and pass it to an application layer. The external device 50 additionally may send the desired subject to the movable platform 40, while an application on the external device 50 may be configured to handle the subject stream and desired subject as well.

The following description is focused on the differences between the camera-centric and distributed solutions. The processor associated with the movable platform 40 and the flight controller implementations need not change. The imaging device 100 in the camera-centric solution is similar to that of the distributed solution, with the addition of the sync and draw modules, which are moved to a position before an HDMI or other high-speed image/data interface.

The following description is focused on the differences between the controller-centric and distributed solutions. The processor associated with the movable platform 40 and flight controller implementations need not change. The imaging device 100 of the controller-centric solution may have an added feature that extends the USB/MTP interface to receive a subject stream and mux it into the session mp4 text track. In this design, the external device 50 may have the detect and identify functions implemented natively, and the draw function may be implemented natively as well. The sync function is removed because the design is synchronous.

Where certain elements of these implementations may be partially or fully implemented using known components, only those portions of such known components that are necessary for an understanding of the present disclosure have been described, and detailed descriptions of other portions of such known components have been omitted so as not to obscure the disclosure.

In the present specification, an implementation showing a singular component should not be considered limiting; rather, the disclosure is intended to encompass other implementations including a plurality of the same component, and vice-versa, unless explicitly stated otherwise herein.

Further, the present disclosure encompasses present and future known equivalents to the components referred to herein by way of illustration.

As used herein, the term “bus” is meant generally to denote all types of interconnection or communication architecture that may be used to communicate data between two or more entities. The “bus” could be optical, wireless, infrared, or another type of communication medium. The exact topology of the bus could be, for example, a standard “bus,” hierarchical bus, network-on-chip, address-event-representation (AER) connection, or other type of communication topology used for accessing, e.g., different memories in a system.

As used herein, the terms “computer,” “computing device,” and “computerized device” include, but are not limited to, personal computers (PCs) and minicomputers, whether desktop, laptop, or otherwise, mainframe computers, workstations, servers, personal digital assistants (PDAs), handheld computers, embedded computers, programmable logic devices, personal communicators, tablet computers, portable navigation aids, J2ME-equipped devices, cellular telephones, smart phones, personal integrated communication or entertainment devices, or literally any other device capable of executing a set of instructions.

As used herein, the term “computer program” or “software” is meant to include any sequence of human- or machine-cognizable steps which perform a function. Such program may be rendered in virtually any programming language or environment including, for example, C/C++, C#, Fortran, COBOL, MATLAB™, PASCAL, Python, assembly language, markup languages (e.g., HTML, SGML, XML, VoXML), as well as object-oriented environments such as the Common Object Request Broker Architecture (CORBA), Java™ (including J2ME, Java Beans), and Binary Runtime Environment (e.g., BREW).

As used herein, the terms “connection,” “link,” “transmission channel,” “delay line,” and “wireless” mean a causal link between any two or more entities (whether physical or logical/virtual) which enables information exchange between the entities.

As used herein, the terms “integrated circuit,” “chip,” and “IC” are meant to refer to an electronic circuit manufactured by the patterned diffusion of trace elements into the surface of a thin substrate of semiconductor material. By way of non-limiting example, integrated circuits may include field programmable gate arrays (FPGAs), programmable logic devices (PLDs), reconfigurable computer fabrics (RCFs), systems on a chip (SoC), application-specific integrated circuits (ASICs), and/or other types of integrated circuits.

As used herein, the term “memory” includes any type of integrated circuit or other storage device adapted for storing digital data including, without limitation, ROM, PROM, EEPROM, DRAM, Mobile DRAM, SDRAM, DDR/2 SDRAM, EDO/FPMS, RLDRAM, SRAM, “flash” memory (e.g., NAND/NOR), memristor memory, and PSRAM.

As used herein, the terms “microprocessor” and “digital processor” are meant generally to include digital processing devices. By way of non-limiting example, digital processing devices may include one or more of digital signal processors (DSPs), reduced instruction set computers (RISC), general-purpose (CISC) processors, microprocessors, gate arrays (e.g., field programmable gate arrays (FPGAs)), PLDs, reconfigurable computer fabrics (RCFs), array processors, secure microprocessors, application-specific integrated circuits (ASICs), and/or other digital processing devices. Such digital processors may be contained on a single unitary IC die, or distributed across multiple components.

As used herein, the term “network interface” refers to any signal, data, and/or software interface with a component, network, and/or process. By way of non-limiting example, a network interface may include one or more of FireWire (e.g., FW400, FW800, and/or other variation), USB (e.g., USB2), Ethernet (e.g., 10/100, 10/100/1000 (Gigabit Ethernet), 10-Gig-E, and/or other Ethernet implementations), MoCA, Coaxsys (e.g., TVnet™), radio frequency tuner (e.g., in-band or OOB, cable modem, and/or other protocol), Wi-Fi (802.11), WiMAX (802.16), PAN (e.g., 802.15), cellular (e.g., 3G, LTE/LTE-A/TD-LTE, GSM, and/or other cellular technology), IrDA families, and/or other network interfaces.

As used herein, the term “Wi-Fi” includes one or more of IEEE-Std. 802.11, variants of IEEE-Std. 802.11, standards related to IEEE-Std. 802.11 (e.g., 802.11 a/b/g/n/s/v), and/or other wireless standards.

As used herein, the term “wireless” means any wireless signal, data, communication, and/or other wireless interface. By way of non-limiting example, a wireless interface may include one or more of Wi-Fi, Bluetooth, 3G (3GPP/3GPP2), HSDPA/HSUPA, TDMA, CDMA (e.g., IS-95A, WCDMA, and/or other wireless technology), FHSS, DSSS, GSM, PAN/802.15, WiMAX (802.16), 802.20, narrowband/FDMA, OFDM, PCS/DCS, LTE/LTE-A/TD-LTE, analog cellular, CDPD, satellite systems, millimeter wave or microwave systems, acoustic, infrared (i.e., IrDA), and/or other wireless interfaces.

As used herein, the term “robot” may be used to describe an autonomous device, autonomous vehicle, computer, artificial intelligence (AI) agent, surveillance system or device, control system or device, and/or other computerized device capable of autonomous operation.

As used herein, the term “camera” may be used to refer to any imaging device or sensor configured to capture, record, and/or convey still and/or video imagery which may be sensitive to visible parts of the electromagnetic spectrum, invisible parts of the electromagnetic spectrum (e.g., infrared, ultraviolet), and/or other energy (e.g., pressure waves).

While certain aspects of the technology are described in terms of a specific sequence of steps of a method, these descriptions are only illustrative of the broader methods of the disclosure and may be modified as required by the particular application. Certain steps may be rendered unnecessary or optional under certain circumstances. Additionally, certain steps or functionality may be added to the disclosed implementations, or the order of performance of two or more steps permuted. All such variations are considered to be encompassed within the disclosure.

While the above detailed description has shown, described, and pointed out novel features of the disclosure as applied to various implementations, it will be understood that various omissions, substitutions, and changes in the form and details of the devices or processes illustrated may be made by those skilled in the art without departing from the disclosure. The foregoing description is in no way meant to be limiting, but rather should be taken as illustrative of the general principles of the technologies.

What is claimed is:
1. A method for controlling a movable imaging assembly (MIA) having a movable platform and an imaging device coupled to and movable relative to the movable platform, the method comprising: receiving user inputs defining an MIA position relative to a target and a frame position of the target within image frames captured by the imaging device, the user inputs including a horizontal distance, a circumferential position, and a vertical distance that define the MIA position, and including a horizontal frame position and a vertical frame position that define the frame position; predicting a future position of the target for a future time; and moving the MIA to be in the MIA position relative to the target at the future time and moving the imaging device relative to the movable platform for the target to be in the frame position for an image frame captured at the future time.
2. A method for controlling a movable imaging assembly (MIA) having a movable platform and an imaging device coupled to and movable relative to the movable platform, the method comprising: receiving user inputs defining an MIA position relative to a target and a frame position of the target within image frames captured by the imaging device; predicting a future position of the target for a future time; and moving the MIA to be in the MIA position relative to the target at the future time and moving the imaging device relative to the movable platform for the target to be in the frame position for an image frame captured at the future time.
3. The method according to claim 2, wherein the MIA position defined by the user inputs includes one or more of a horizontal distance, a circumferential position, or a vertical distance between the MIA and the target.
4. The method according to claim 3, wherein the user inputs include each of the horizontal distance, the circumferential position, and the vertical distance between the MIA and the target.
5. The method according to claim 4, wherein the horizontal distance, the circumferential position, and the vertical distance are fixed values defined by the user inputs.
6. The method according to claim 3, wherein the user inputs include a fixed value for at least one of the horizontal distance, the circumferential position, or the vertical distance, and the user inputs include a choreographed flight pattern by which another of the horizontal distance, the circumferential position, or the vertical distance is varied.
7. The method according to claim 3, wherein the user inputs include a frame of reference by which the circumferential position is defined as one of fixed or dependent on a trajectory of the target.
8. The method according to claim 2, wherein the user inputs include one or more of a horizontal frame position or a vertical frame position of the target in the image frame.
9. The method according to claim 8, wherein the user inputs include a fixed value for at least one of the horizontal frame position or the vertical frame position.
10. The method according to claim 8, wherein the one or more of the horizontal frame position or the vertical frame position form a region or a bounding box within the image frame.
11. The method according to claim 8, wherein the horizontal frame position or the vertical frame position is one of restricted or guided according to a width of the image frame.
12. A method for controlling a movable imaging assembly (MIA) having a movable platform and an imaging device coupled to and movable relative to the movable platform, the method comprising: predicting a future zone position at a future time for one or more restricted zones that are defined relative to a target and in which the MIA is restricted from traveling; predicting whether intended flight instructions will result in the MIA traveling into the one or more restricted zones at the future time; and controlling the MIA according to the intended flight instructions if the MIA is predicted to not travel into the one or more restricted zones with the intended flight instructions, or controlling the MIA according to modified flight instructions if the MIA is predicted to travel into the one or more restricted zones with the intended flight instructions.
13. The method according to claim 12, wherein the modified flight instructions are predicted to not result in the MIA traveling into the one or more restricted zones at the future time.
14. The method according to claim 12, wherein predicting the future zone position of the one or more restricted zones includes predicting a future target position of the target.
15. The method according to claim 14, wherein the future target position of the target is predicted according to past target positions of the target.
16. The method according to claim 15, wherein the past target positions are determined according to past image frames captured by the imaging device.
17. The method according to claim 12, wherein the one or more restricted zones include one or more of a first restricted zone defined by a maximum distance from the target, a second restricted zone defined by a minimum distance from the target, and a third restricted zone defined above the target.
18. The method according to claim 17, wherein the one or more restricted zones include the first restricted zone that is outside the maximum distance and the second restricted zone that is inside the minimum distance.
19. The method according to claim 18, wherein the one or more restricted zones include the third restricted zone that is a conical region above the target.
20. The method according to claim 17, wherein the one or more restricted zones include the third restricted zone that is defined according to a field of view of the imaging device containing the target.